Inspiration

Traditional physical therapy check-ins are often rigid and form-based, and they fail to capture the nuance of a patient's true state of mind. We wanted to change that by creating an intelligent copilot that replaces clinical surveys with natural, voice-driven interactions - one that listens not just to what patients say, but to how they say it.

What it does

Caden is a voice-enabled physical therapy copilot designed to bridge the gap between sessions. Patients interact with Caden through natural voice check-ins. Behind the scenes, Caden listens for emotional undertones (stress, avoidance, confidence, hesitation) and cross-references them with the patient's existing therapy plan, updating clinical hypotheses and tracking exercise progress in real time.

How we built it

  • Voice Intelligence: Integrated Modulate's Velma-2 API to process audio blobs, providing batch Speech-to-Text inference and deep per-utterance emotion detection.
  • Agentic Brain: Powered by the Mastra agent framework, which manages custom tools that let the LLM (Gemini 3 Flash) query patient context, synthesize new therapeutic hypotheses, and write to long-term memory.
  • Data Layer: Engineered a robust 16-table PostgreSQL schema managed by Drizzle ORM.
  • Infrastructure: Initially built on Google Cloud Platform (Cloud Tasks, Cloud Run), then pivoted mid-hackathon to Supabase and Vercel to accelerate deployment and local iteration.
  • Frontend: Built on Next.js with a mobile-first UI driven by strictly typed Tailwind CSS tokens (e.g., surface-1, accent-base) to enforce a clinical yet warm aesthetic.
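
As a rough illustration of how per-utterance emotion output could be condensed into something an agent tool can consume, here is a minimal sketch. The `Utterance` shape, the `score` field, and `summarizeCheckIn` are assumptions made for this example; they are not the actual Velma-2 response format or Caden's code.

```typescript
// Hypothetical emotion labels, taken from the undertones named in this write-up.
type Emotion = "stress" | "avoidance" | "confidence" | "hesitation";

// Assumed shape of one transcribed utterance (not the real Velma-2 payload).
interface Utterance {
  text: string;
  emotion: Emotion;
  score: number; // assumed confidence in [0, 1]
}

interface CheckInSummary {
  transcript: string;
  dominantEmotion: Emotion | null;
  meanScores: Partial<Record<Emotion, number>>;
}

function summarizeCheckIn(utterances: Utterance[]): CheckInSummary {
  // Accumulate the score sum and utterance count per emotion.
  const totals: Partial<Record<Emotion, { sum: number; count: number }>> = {};
  for (const u of utterances) {
    const entry = totals[u.emotion] ?? { sum: 0, count: 0 };
    entry.sum += u.score;
    entry.count += 1;
    totals[u.emotion] = entry;
  }

  const meanScores: Partial<Record<Emotion, number>> = {};
  let dominantEmotion: Emotion | null = null;
  let bestSum = -Infinity;
  for (const emotion of Object.keys(totals) as Emotion[]) {
    const entry = totals[emotion];
    if (!entry) continue;
    meanScores[emotion] = entry.sum / entry.count;
    // Dominant = highest summed score, so a recurring emotion
    // outweighs a single high-scoring utterance.
    if (entry.sum > bestSum) {
      bestSum = entry.sum;
      dominantEmotion = emotion;
    }
  }

  return {
    transcript: utterances.map((u) => u.text).join(" "),
    dominantEmotion,
    meanScores,
  };
}
```

A summary like this could be passed to the agent alongside the therapy plan, so the LLM reasons over a compact signal rather than raw per-utterance data.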

Challenges we ran into

  • The Infrastructure Pivot: Halfway through, we recognized our GCP architecture would bottleneck MVP iteration, and executed a high-pressure pivot to a Vercel-hosted stack with managed Postgres while keeping our backend schema intact.
  • Scope: The user and patient interactions and the UX work expanded the scope, which made it difficult to deliver within the four hours we had, especially on top of the infrastructure issues.

Accomplishments that we're proud of

  • True Emotion Detection: We moved beyond basic STT - Caden captures how a patient is feeling, which feeds into smarter agent actions and memory management.
  • The UI Polish: We broke away from the "boring medical app" trope. The frontend feels clean, space-efficient, and premium, especially the horizontal mobile-first schedule planner.
  • Complex Full-Stack Integration: We glued together a bleeding-edge voice API (Modulate), an autonomous agent framework (Mastra), and a heavy relational schema (PostgreSQL via Drizzle ORM) within a 4-hour window - and made the end-to-end flow actually work.

What we learned

  • Knowing When to Pivot: Exercise adherence tracking wasn't the point - the real clinical value was in capturing directional signals from the agent to give physical therapists actionable guidance.
  • Reducing Cognitive Load Through UI: Surfacing session context in a drawer rather than across multiple tabs let patients jog their memory without disrupting the check-in flow.
  • Simplifying the PT Decision Surface: Therapists shouldn't have to process more data - just approve the agent's plan or intervene. Centering the UI around that single choice made the PT handoff seamless.
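
The "approve or intervene" choice described above can be modeled as a two-variant decision type. The sketch below is only one possible shape; `ProposedPlan`, `TherapistDecision`, and `applyDecision` are hypothetical names introduced for illustration, not identifiers from the Caden codebase.

```typescript
type PlanStatus = "pending_review" | "approved" | "overridden";

// Hypothetical record for a plan the agent has proposed.
interface ProposedPlan {
  id: string;
  summary: string;
  status: PlanStatus;
  therapistNote?: string;
}

// The therapist has exactly two options: approve, or intervene with a note.
type TherapistDecision =
  | { kind: "approve" }
  | { kind: "intervene"; note: string };

function applyDecision(
  plan: ProposedPlan,
  decision: TherapistDecision
): ProposedPlan {
  // A plan can only be decided once in this sketch.
  if (plan.status !== "pending_review") {
    throw new Error(`Plan ${plan.id} was already reviewed`);
  }
  switch (decision.kind) {
    case "approve":
      return { ...plan, status: "approved" };
    case "intervene":
      return { ...plan, status: "overridden", therapistNote: decision.note };
  }
}
```

Keeping the decision to two variants means the UI only needs two actions, and the compiler flags any code path that forgets to handle one of them.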

What's next for Caden

  • Real Therapist Handoff: Refining escalation heuristics so physical therapists are alerted precisely when a patient flags high hesitation or pain through their voice check-ins.
  • Wearables Integration: Fusing vocal emotion signals with wearable hardware data (e.g., Apple HealthKit, Garmin) to provide a complete multi-modal view of a patient's recovery trajectory.
  • Cross-Patient, Cross-Therapist Learning: Improving learning on both sides through anonymized training data.
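
To make the escalation idea concrete, here is a minimal threshold-based sketch. Everything in it is an assumption for illustration: the signal fields, the 0-10 pain scale, and the default thresholds are placeholders, not clinically validated values or Caden's actual heuristics.

```typescript
// Assumed signals extracted from one voice check-in.
interface CheckInSignals {
  hesitation: number;   // assumed score in [0, 1]
  reportedPain: number; // assumed self-reported pain on a 0-10 scale
}

interface EscalationPolicy {
  hesitationThreshold: number;
  painThreshold: number;
}

// Placeholder thresholds; real values would need clinical input.
const DEFAULT_POLICY: EscalationPolicy = {
  hesitationThreshold: 0.7,
  painThreshold: 7,
};

function shouldEscalate(
  signals: CheckInSignals,
  policy: EscalationPolicy = DEFAULT_POLICY
): { escalate: boolean; reasons: string[] } {
  const reasons: string[] = [];
  if (signals.hesitation >= policy.hesitationThreshold) {
    reasons.push("high hesitation");
  }
  if (signals.reportedPain >= policy.painThreshold) {
    reasons.push("high pain");
  }
  // Escalate when any single signal crosses its threshold.
  return { escalate: reasons.length > 0, reasons };
}
```

Returning the reasons alongside the boolean lets the therapist-facing alert explain why it fired, which fits the goal of keeping the PT decision simple.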
