Inspiration

My mom works for a tele-nurse service, and one day she told me about a call where the person on the other end was hard of hearing. They used an interpreter, but the call took far longer than it should have. Thankfully the patient only had a minor cold, but it made me wonder: what would happen in a dire situation where no interpreter was available?

What it does

ASOwl is an AI-powered triage nurse that "sees" and "understands" patients.

- ASL Recognition: uses your webcam to interpret hand gestures and fingerspelling into real-time text.
- Face Analysis: employs computer vision to detect facial expressions, mapping them to clinical pain levels and identifying critical indicators like facial asymmetry or distress.
- The Triage AI (ASOwl): a conversational AI assistant that conducts a medical intake interview based on the signs it detects.
- Urgency Classification: automatically generates a structured medical summary, including an urgency level (Low to Critical) and symptom notes for the doctor, as sketched below.
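To give a feel for what that structured summary contains, here is an illustrative TypeScript shape with a hypothetical low-urgency example. The field names are assumptions for illustration, not ASOwl's actual schema.

```typescript
// Illustrative sketch (not the project's actual schema) of the structured
// summary ASOwl hands to the doctor after the intake interview.
type UrgencyLevel = "Low" | "Moderate" | "High" | "Critical";

interface TriageSummary {
  urgency: UrgencyLevel;      // overall urgency classification
  painLevel: number;          // 0-10, inferred from facial expression analysis
  facialFlags: string[];      // e.g. ["facial asymmetry", "visible distress"]
  reportedSymptoms: string[]; // collected from the signed/fingerspelled answers
  notesForDoctor: string;     // free-text summary generated by the triage AI
}

// Hypothetical example for a low-urgency case.
const example: TriageSummary = {
  urgency: "Low",
  painLevel: 2,
  facialFlags: [],
  reportedSymptoms: ["sore throat", "mild congestion"],
  notesForDoctor:
    "Patient reports cold-like symptoms for two days; no red flags detected.",
};

console.log(JSON.stringify(example, null, 2));
```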

How we built it

We built ASOwl with a high-performance stack: Google MediaPipe for ASL gestures and BlazeFace for ultra-fast face detection. Google Gemini 1.5 Pro serves as the diagnostic intelligence, processing visual inputs into structured medical summaries. The application is built on React and Vite, styled with Tailwind CSS, and backed by Node.js, PostgreSQL, and Drizzle ORM. We leveraged Google Antigravity as our agentic coding partner to accelerate the end-to-end implementation and UI polish.
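As a rough sketch of the Gemini piece, the snippet below shows one way the detected signs and face metrics could be turned into a structured JSON summary using the @google/generative-ai Node SDK. The prompt wording, field names, helper function, and the GEMINI_API_KEY environment variable are assumptions for illustration, not ASOwl's actual code.

```typescript
// Minimal sketch: hand the transcribed signs and face metrics to Gemini 1.5 Pro
// and ask for a structured JSON triage summary back.
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const model = genAI.getGenerativeModel({
  model: "gemini-1.5-pro",
  generationConfig: { responseMimeType: "application/json" }, // request JSON output
});

export async function summarizeIntake(signedText: string, faceMetrics: object) {
  const prompt = `You are a triage nurse. Based on the patient's answers
(transcribed from ASL) and the facial analysis metrics, return JSON with
"urgency" (Low/Moderate/High/Critical), "symptoms" (string[]), and "notes".

Patient answers: ${signedText}
Facial analysis: ${JSON.stringify(faceMetrics)}`;

  const result = await model.generateContent(prompt);
  return JSON.parse(result.response.text());
}
```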

Challenges I ran into

The biggest hurdle was the performance of running dual vision models (hand and face) simultaneously while also maintaining the conversational AI state, all as a solo developer. I had to optimize the landmark drawing and data processing loops to ensure the camera feed stayed smooth while the AI processed the patient's inputs. Tuning the medical prompts so the AI stayed professional and empathetic was also a delicate balancing act.
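The sketch below illustrates the general decoupling idea, with placeholder function names standing in for the real MediaPipe and network calls: draw landmarks on every animation frame, but only dispatch results to the triage AI on a throttled interval so the feed never waits on a slow response.

```typescript
// Placeholder stubs standing in for the real MediaPipe + network calls.
function detectHandLandmarks(): number[][] { return []; }
function drawLandmarks(_landmarks: number[][]): void {}
async function sendToTriageAI(_landmarks: number[][]): Promise<void> {}

let lastAiDispatch = 0;
const AI_INTERVAL_MS = 500; // send at most ~2 updates/second to the triage AI

function renderLoop(now: number) {
  const landmarks = detectHandLandmarks(); // cheap per-frame detection
  drawLandmarks(landmarks);                // canvas drawing, runs every frame

  if (now - lastAiDispatch > AI_INTERVAL_MS) {
    lastAiDispatch = now;
    // fire-and-forget so the render loop never blocks on the network
    void sendToTriageAI(landmarks);
  }
  requestAnimationFrame(renderLoop);
}
requestAnimationFrame(renderLoop);
```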

What I learned

I learned a massive amount about the nuances of computer vision: how lighting and positioning affect hand-landmark accuracy, how a camera can pinpoint specific points like the joints on a hand, and how difficult it is to quantify "pain" through code. I also gained a much deeper understanding of how to use LLMs as structured data processors rather than just simple chatbots.
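For anyone curious what "finding the joints on a hand" looks like in code, here is a minimal sketch using MediaPipe's HandLandmarker from @mediapipe/tasks-vision (21 landmarks per hand, with the index fingertip at index 8). The model URL and wiring are illustrative and may differ from what ASOwl actually uses.

```typescript
import { FilesetResolver, HandLandmarker } from "@mediapipe/tasks-vision";

async function setupHandTracking(video: HTMLVideoElement) {
  // Load the WASM runtime and the hand landmark model.
  const vision = await FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision/wasm"
  );
  const landmarker = await HandLandmarker.createFromOptions(vision, {
    baseOptions: {
      modelAssetPath:
        "https://storage.googleapis.com/mediapipe-models/hand_landmarker/hand_landmarker/float16/1/hand_landmarker.task",
    },
    runningMode: "VIDEO",
    numHands: 1,
  });

  // Detect landmarks for the current video frame.
  const result = landmarker.detectForVideo(video, performance.now());
  const hand = result.landmarks[0];
  if (hand) {
    const indexTip = hand[8]; // {x, y, z}, normalized to the frame
    console.log("Index fingertip:", indexTip.x, indexTip.y);
  }
}
```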

What's next for ASOwl

The next step is moving beyond fingerspelling to full ASL sentence recognition. I also want to integrate native biometric support (using phone sensors to detect heart rate) to combine with the vision data, creating a truly all-in-one digital triage clinic that can be deployed in rural areas, during emergency field responses, or even at home.
