Inspiration

Watching Ray's grandfather slowly lose himself to dementia was the hardest thing our family ever went through. By the time he was diagnosed, the disease had already progressed significantly — and we kept asking ourselves, what if we'd caught it earlier? The Clock Drawing Test is one of the most powerful quick screens in medicine, but it requires a trained clinician to administer and score. We wanted to put that screening power into the hands of any family who's starting to worry about a loved one, no appointment needed.

What it does

ClockWise is a voice-guided AI agent that walks patients through the clinically validated Clock Drawing Test while simultaneously analyzing their drawing via computer vision and their speech patterns in real time. As the patient talks through the exercise, a live Speech Graph builds on screen — visualizing word connections as a network where healthy speech expands outward and concerning patterns collapse into tight, repetitive loops. After the session, ClockWise generates a cognitive risk score across multiple domains (circle quality, number placement, hand accuracy, spatial organization) and produces a downloadable PDF report the patient can bring to their doctor.

How we built it

We built the frontend in React with a three-column dashboard layout — voice agent and transcript on the left, a live camera feed in the center, and the Speech Graph visualization on the right. The voice agent is powered by Gemini's multimodal API, which handles both the conversational flow and the image analysis of the completed clock drawing. Speech-to-text runs through the Web Speech API, and the transcript is processed in real time to construct a directed graph using Speech Graph Analysis, rendered on a canvas element with a custom force-directed physics simulation. Drawing analysis scores are computed by feeding the captured image to Gemini with a detailed clinical scoring prompt based on the Shulman 6-point scale. The final PDF report is generated with all session data, scores, and the speech graph snapshot included.
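
As a rough illustration of the transcript-to-graph step described above, here is a minimal sketch in TypeScript. It assumes the simplest form of speech graph construction: each distinct word becomes a node and each pair of consecutive words becomes a directed edge. The names `tokenize`, `buildSpeechGraph`, and `graphStats` are hypothetical, and the project's actual tokenization and graph metrics may differ.

```typescript
// Hypothetical sketch: not the project's actual code.
type SpeechGraph = {
  nodes: Set<string>;
  // Key is "from to" (tokens never contain spaces); value is how often
  // that word-to-word transition occurred in the transcript.
  edges: Map<string, number>;
};

// Lowercase the transcript and split on anything that is not a letter,
// digit, or apostrophe. This is an assumed, English-only tokenizer.
function tokenize(transcript: string): string[] {
  return transcript
    .toLowerCase()
    .split(/[^a-z0-9']+/)
    .filter((w) => w.length > 0);
}

// Build a directed graph: one node per distinct word, one edge per
// distinct consecutive word pair.
function buildSpeechGraph(transcript: string): SpeechGraph {
  const words = tokenize(transcript);
  const nodes = new Set<string>(words);
  const edges = new Map<string, number>();
  for (let i = 0; i + 1 < words.length; i++) {
    const key = words[i] + " " + words[i + 1];
    edges.set(key, (edges.get(key) || 0) + 1);
  }
  return { nodes, edges };
}

// Simple summary counts. Repeated edges and self-loops are used here as
// stand-ins for the "tight, repetitive loops" a viewer would see.
function graphStats(graph: SpeechGraph) {
  let repeatedEdges = 0; // distinct transitions that occurred more than once
  let selfLoops = 0;     // a word immediately followed by itself
  graph.edges.forEach((count, key) => {
    const parts = key.split(" ");
    if (count > 1) repeatedEdges++;
    if (parts[0] === parts[1]) selfLoops++;
  });
  return {
    nodes: graph.nodes.size,
    edges: graph.edges.size,
    repeatedEdges,
    selfLoops,
  };
}

const demo = graphStats(buildSpeechGraph("I draw the clock, the clock the clock"));
// demo is { nodes: 4, edges: 4, repeatedEdges: 2, selfLoops: 0 }
```

In a live setting the same function could be re-run (or updated incrementally) each time the Web Speech API delivers a new final transcript segment, with the resulting nodes and edges handed to the canvas renderer.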

Challenges we ran into

Getting the Speech Graph to feel alive and responsive was trickier than expected: tuning the force simulation so nodes didn't fly off screen or collapse into a single blob took a lot of iteration on the repulsion, attraction, and damping constants. Prompt engineering for the clinical scoring was another challenge; we needed Gemini to evaluate drawings precisely and consistently against specific clinical criteria rather than give vague assessments. Balancing the voice agent's warmth with clinical standardization was a design tension we kept coming back to: too scripted feels robotic, while too freeform compromises the validity of the screening.
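
To make the three tuning knobs concrete, here is a minimal sketch of one tick of a force-directed layout in TypeScript. It assumes inverse-square repulsion between all node pairs, a linear spring along each edge, and velocity damping, with positions clamped to the canvas. The type names, `stepSimulation`, and every constant shown are illustrative assumptions, not the values or code the project uses.

```typescript
// Hypothetical sketch: not the project's actual simulation.
type GraphNode = { x: number; y: number; vx: number; vy: number };
type GraphEdge = { source: number; target: number }; // indices into the node array

type SimParams = {
  repulsion: number;  // strength of the push between every pair of nodes
  attraction: number; // spring stiffness along each edge
  restLength: number; // preferred edge length in pixels
  damping: number;    // fraction of velocity kept each tick (0 to 1)
  width: number;      // canvas width in pixels
  height: number;     // canvas height in pixels
};

function clamp(v: number, lo: number, hi: number): number {
  return Math.min(Math.max(v, lo), hi);
}

// Advance the layout by one tick, mutating the nodes in place.
function stepSimulation(nodes: GraphNode[], edges: GraphEdge[], p: SimParams): void {
  const fx = nodes.map(() => 0);
  const fy = nodes.map(() => 0);

  // Repulsion: every pair pushes apart with force repulsion / distance^2.
  // The squared distance is floored at 1 so near-overlapping nodes do not
  // produce huge forces (exactly coincident nodes get no push at all).
  for (let i = 0; i < nodes.length; i++) {
    for (let j = i + 1; j < nodes.length; j++) {
      const dx = nodes[i].x - nodes[j].x;
      const dy = nodes[i].y - nodes[j].y;
      const distSq = Math.max(dx * dx + dy * dy, 1);
      const dist = Math.sqrt(distSq);
      const f = p.repulsion / distSq;
      fx[i] += (dx / dist) * f;
      fy[i] += (dy / dist) * f;
      fx[j] -= (dx / dist) * f;
      fy[j] -= (dy / dist) * f;
    }
  }

  // Attraction: each edge acts as a spring pulling toward restLength.
  for (const e of edges) {
    const a = nodes[e.source];
    const b = nodes[e.target];
    const dx = b.x - a.x;
    const dy = b.y - a.y;
    const dist = Math.max(Math.sqrt(dx * dx + dy * dy), 1);
    const f = p.attraction * (dist - p.restLength);
    fx[e.source] += (dx / dist) * f;
    fy[e.source] += (dy / dist) * f;
    fx[e.target] -= (dx / dist) * f;
    fy[e.target] -= (dy / dist) * f;
  }

  // Integrate: damping bleeds off energy so the layout settles, and
  // clamping keeps every node on the canvas.
  for (let i = 0; i < nodes.length; i++) {
    const n = nodes[i];
    n.vx = (n.vx + fx[i]) * p.damping;
    n.vy = (n.vy + fy[i]) * p.damping;
    n.x = clamp(n.x + n.vx, 0, p.width);
    n.y = clamp(n.y + n.vy, 0, p.height);
  }
}
```

In this kind of scheme, too little damping or too much repulsion tends to send nodes to the edges of the canvas, while too much attraction pulls connected nodes into a clump, which matches the failure modes described above. A function like this would typically be called once per animation frame before redrawing.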

Accomplishments that we're proud of

The live Speech Graph visualization is something we're really proud of — watching the network build in real time as someone speaks is genuinely captivating, and the contrast between a healthy graph and a concerning one is immediately visible without any medical knowledge. We're also proud of how approachable the whole experience feels. Our voice agent doesn't feel like a clinical tool — it feels like a kind companion walking you through a simple exercise. That was intentional and important to us.

What we learned

We learned that multimodal fusion, combining drawing analysis with speech biomarkers, is significantly more powerful than either signal alone. Research reports a 7 to 11 point accuracy improvement, and even in our prototype you can see how speech patterns reveal things the drawing alone might miss. We also learned a lot about responsible AI in healthcare: the importance of framing results as screening rather than diagnosis, the need for clear disclaimers, and how much the tone of an AI agent matters when interacting with vulnerable populations.

What's next for ClockWise

Our immediate next feature is eye tracking: using the webcam to monitor whether the patient's gaze stays focused and coordinated during the drawing task, or whether they appear disoriented and struggle to track between the paper and their reference points. Gaze patterns are an emerging biomarker for cognitive decline. Beyond that, we want to add longitudinal tracking so caregivers can monitor changes over time, add multilingual support to reach underserved communities, and ultimately pursue clinical validation so ClockWise can be brought to primary care practices as a standard screening tool.
