Inspiration
Therapists often see patients once a week (or less), but emotional state can change every day. Between sessions, important signals get lost in text check-ins or never get shared at all. We built Sessional to close that gap: give patients a low-friction way to send voice updates, and give therapists a fast, structured view of what needs attention before the next session.
What it does
Sessional is an asynchronous voice check-in platform for mental health care.
- Patients record a short voice memo in a guided flow.
- The system analyzes both:
  - what is said (semantic sentiment), and
  - how it is said (prosody / vocal affect).
- It detects possible divergence moments (e.g., the words “I’m fine” spoken in a stressed tone).
- Therapists get a clinician-facing brief with:
  - 60-second snapshot
  - risk level
  - key themes
  - divergence timeline with timestamps
  - suggested opening questions
  - trend indicators vs prior check-ins
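To make the divergence idea concrete, here is a minimal sketch of flagging utterances whose words and tone disagree. It is not Sessional's actual logic: the `Utterance` fields, the [-1, 1] score ranges, and the 0.6 threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    start_s: float      # utterance start time in seconds
    text: str           # transcribed words
    text_score: float   # hypothetical semantic sentiment in [-1, 1]
    voice_score: float  # hypothetical vocal-affect valence in [-1, 1]


def find_divergences(utterances, threshold=0.6):
    """Return (timestamp, text, gap) for utterances where words and tone disagree."""
    flagged = []
    for u in utterances:
        gap = u.text_score - u.voice_score
        # A large gap means the words sound more positive (or negative) than the voice.
        if abs(gap) >= threshold:
            flagged.append((u.start_s, u.text, round(gap, 2)))
    return flagged


memo = [
    Utterance(0.0, "Work has been busy.", 0.1, 0.0),
    Utterance(4.2, "I'm fine, really.", 0.7, -0.5),
]
print(find_divergences(memo))
```

With this sample memo, only the second utterance is flagged, and its timestamp is what would feed a divergence timeline.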
How we built it
- Frontend: Next.js + React + TypeScript + Tailwind
- Backend: FastAPI + Pydantic + SQLAlchemy
- Database: PostgreSQL (including audio storage for the local MVP)
- Auth: Email/password + JWT + role-based access (patient/clinician)
- AI pipeline:
  - Hume AI for utterance-level prosody + language sentiment
  - Gemini for structured, evidence-grounded brief synthesis
- Quality controls: schema-constrained outputs, guardrails, snippet normalization, fallback behavior when external APIs fail
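The quality controls listed above could look roughly like the following sketch, which uses only the standard library. The field names (`risk_level`, `evidence`), the allowed risk values, and the helper names are hypothetical; the real project uses Pydantic schemas, which are not reproduced here.

```python
import json

ALLOWED_RISK = {"low", "moderate", "high"}  # assumed label set


def normalize(text):
    """Lowercase and collapse whitespace so snippets compare reliably."""
    return " ".join(text.lower().split())


def fallback_brief(reason):
    """Conservative brief returned when the model output cannot be trusted."""
    return {"risk_level": "unknown", "evidence": [], "note": reason}


def validate_brief(raw_json, transcript):
    """Parse model output and keep only evidence found in the transcript."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError:
        return fallback_brief("model output was not valid JSON")
    if not isinstance(data, dict):
        return fallback_brief("model output was not a JSON object")
    risk = data.get("risk_level")
    if not isinstance(risk, str) or risk not in ALLOWED_RISK:
        return fallback_brief("missing or unexpected risk_level")
    evidence = data.get("evidence", [])
    if not isinstance(evidence, list):
        evidence = []
    haystack = normalize(transcript)
    # Guardrail: drop any quoted snippet the patient did not actually say.
    grounded = [s for s in evidence if isinstance(s, str) and normalize(s) in haystack]
    return {"risk_level": risk, "evidence": grounded}
```

The design choice here is to degrade to an explicit "unknown" brief instead of passing unverified model text to a clinician.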
Challenges we ran into
- Designing a clinically useful output without over-claiming or hallucinating
- Balancing summary richness with strict evidence-grounding
- Handling real-world audio variability and short/non-ideal utterances
- Integrating multi-step async processing (upload → extraction → synthesis) with clear UX
- Keeping role-based routing/auth secure while moving quickly in MVP mode
- Making divergence explanations understandable to therapists, not just technically correct
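One way to keep the upload → extraction → synthesis flow legible to the UI is to report a status at each step. The sketch below assumes hypothetical stage functions and status names; the stand-in coroutines do no real work and do not call Hume AI or Gemini.

```python
import asyncio


async def extract_signals(audio):
    """Stand-in for the prosody/sentiment extraction step."""
    await asyncio.sleep(0)
    return {"audio_bytes": len(audio)}


async def synthesize_brief(signals):
    """Stand-in for the brief synthesis step."""
    await asyncio.sleep(0)
    return {"summary": f"{signals['audio_bytes']} bytes analyzed"}


async def process_check_in(audio, on_status):
    """Run the pipeline in order, reporting each stage so the UI can show progress."""
    on_status("uploaded")
    try:
        signals = await extract_signals(audio)
        on_status("extracted")
        brief = await synthesize_brief(signals)
        on_status("ready")
        return brief
    except Exception:
        # Surface a clear terminal state instead of leaving the UI spinning.
        on_status("failed")
        return None


statuses = []
brief = asyncio.run(process_check_in(b"\x00\x01", statuses.append))
print(statuses, brief)
```

In a production setup the status callback would more likely write to the database or a job queue than to an in-memory list.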
Accomplishments that we're proud of
- Built a full end-to-end MVP from recording to therapist dashboard
- Implemented true semantic-vs-prosody divergence logic (not just keyword matching)
- Improved report quality with grounded prompts + post-generation guardrails
- Added trend logic across check-ins (risk/frequency/intensity direction)
- Shipped practical therapist UX touches: quick-view modal, full brief page, report deletion
- Added tests around normalization, divergence logic, and guardrail behavior
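As an illustration of trend direction across check-ins, the sketch below compares the latest value with the average of prior ones. The function name, the labels, and the 0.1 tolerance are assumptions, not the project's actual rules.

```python
from statistics import mean


def trend_direction(history, latest, tolerance=0.1):
    """Compare the latest value with the mean of prior check-ins."""
    if not history:
        return "no baseline"  # first check-in: nothing to compare against
    delta = latest - mean(history)
    if delta > tolerance:
        return "rising"
    if delta < -tolerance:
        return "falling"
    return "stable"


# Hypothetical per-check-in divergence intensity scores.
print(trend_direction([0.2, 0.3, 0.25], 0.6))
```

The same comparison could be applied separately to risk, divergence frequency, and intensity.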
What we learned
- In mental health tooling, trust comes from clarity and restraint, not flashy generation
- Structured outputs + repair layers significantly reduce hallucination risk
- Small UX details (wording, color semantics, snippet context) strongly affect clinician confidence
- Voice carries critical emotional information that text-only check-ins can miss
- Building modular provider layers early makes future model swaps much easier
What's next for Sessional
- Longitudinal therapist views across weeks/months (trajectory over time)
- Better explainability for each risk/trend signal in plain clinical language
- Human-in-the-loop feedback to improve prompting and report quality
- Stronger privacy controls and lifecycle policies for stored audio
- EHR/workflow integrations and therapist team collaboration features
- Production hardening: background jobs, retries, monitoring, and auditability
Built With
- eslint
- fastapi
- google-gemini-api
- httpx
- hume-ai-api
- javascript
- jwt
- next.js
- npm
- passlib
- postgresql
- psycopg
- pydantic
- pytest
- python
- react
- sqlalchemy
- tailwind-css
- typescript
- uvicorn