Inspiration
SpeakEasy started with a simple problem: speech assessments can be hard to access, expensive, inconsistent, and difficult to track over time.
We were especially motivated by rare disease and neurological communities, where speech changes are often one of the earliest and most frustrating symptoms. Conditions like ataxia are often assessed with clinician-rated scales such as SARA, where speech is judged during short in-person visits. Patients may travel long distances, wait weeks between appointments, and still leave without a clear way to measure progress.
We wanted to build something that gives patients more clarity, providers better data, and both sides a smoother path to care.
What it does
SpeakEasy turns a short voice session into useful, measurable insights.
Users complete three guided tasks:
- Read a short sentence
- Repeat syllables for rhythm and motor speech testing
- Have a short natural conversation
From those recordings, SpeakEasy evaluates five core areas:
- Fluency
- Clarity
- Rhythm
- Prosody
- Pronunciation
The platform then provides:
- A composite speech score
- Visual charts and progress trends
- Personalized strengths and focus areas
- A clinician-ready PDF report
- Session history tracking
- A guardrailed ElevenLabs voice assistant that explains results and gives coaching based only on validated data
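To make the composite score and the strengths and focus areas concrete, here is a minimal sketch of how five area scores could be combined. It is an illustration, not the scoring we shipped: the 0-100 scale, the equal default weights, and the function names `composite_score` and `strengths_and_focus` are all assumptions.

```python
# Area names come from the list above; the 0-100 scale is an assumption.
AREAS = ("fluency", "clarity", "rhythm", "prosody", "pronunciation")


def composite_score(scores, weights=None):
    """Return the weighted mean of the five area scores, rounded to 1 decimal."""
    missing = [area for area in AREAS if area not in scores]
    if missing:
        raise ValueError(f"missing area scores: {missing}")
    if weights is None:
        # Hypothetical default: every area counts equally.
        weights = {area: 1.0 for area in AREAS}
    total_weight = sum(weights[area] for area in AREAS)
    weighted_sum = sum(scores[area] * weights[area] for area in AREAS)
    return round(weighted_sum / total_weight, 1)


def strengths_and_focus(scores, n=2):
    """Return (top n areas, bottom n areas), the weakest area listed first."""
    ranked = sorted(AREAS, key=lambda area: scores[area], reverse=True)
    return ranked[:n], ranked[-n:][::-1]
```

Passing a `weights` dictionary would let a provider emphasize one area, for example rhythm in an ataxia-focused review, without changing the rest of the pipeline.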
How we built it
We built SpeakEasy as a layered AI healthcare system.
Frontend
- React
- Vite
- Tailwind CSS
- Browser recording with the MediaRecorder API
Backend
- FastAPI for API orchestration
- faster-whisper for transcription
- librosa + parselmouth for feature extraction
- reportlab + matplotlib for reports and charts
AI Agent Architecture
We used uAgents + Agentverse to coordinate specialized agents.
Assessment Agent
Receives speech metrics from the backend.
Report Agent
Turns technical results into clear summaries and clinician-ready PDFs.
Progress Tracker
Compares new sessions with historical sessions to spot trends over time.
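One simple way to compare a new session with history is a least-squares slope over the sequence of composite scores. The sketch below assumes scores are stored oldest first; the function names and the `tolerance` threshold are hypothetical and are not taken from our agent code.

```python
def trend_slope(values):
    """Least-squares slope of values against their session index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def compare_to_history(history, latest, tolerance=1.0):
    """Compare the latest score with earlier sessions (oldest first)."""
    if not history:
        # First session: nothing to compare against yet.
        return {"delta": None, "slope": 0.0, "trend": "baseline"}
    baseline = sum(history) / len(history)
    slope = trend_slope(list(history) + [latest])
    if slope > tolerance:
        trend = "improving"
    elif slope < -tolerance:
        trend = "declining"
    else:
        trend = "stable"
    # delta is the change relative to the mean of all earlier sessions.
    return {"delta": latest - baseline, "slope": slope, "trend": trend}
```

Using a slope across all sessions rather than only the last two makes the label less sensitive to a single noisy recording.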
Therapist Agent
Builds structured prompts for an ElevenLabs voice assistant that explains results and gives personalized coaching.
Safety Guardrails
The voice assistant is limited to the data it receives. It cannot invent diagnoses, give unsupported medical advice, or go beyond the validated results.
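A guardrail of this kind can be sketched as a prompt builder that only passes through whitelisted, range-checked fields and states the rules explicitly. This is an assumed illustration: the field names, the 0-100 range, and the wording of the rules are hypothetical, and the result is a plain string rather than a call to any ElevenLabs API.

```python
# Hypothetical whitelist: only these numeric fields may reach the assistant.
ALLOWED_FIELDS = (
    "composite", "fluency", "clarity", "rhythm", "prosody", "pronunciation",
)


def build_guardrailed_prompt(results):
    """Build a system prompt from validated scores only, dropping everything else."""
    validated = {}
    for key in ALLOWED_FIELDS:
        value = results.get(key)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        # Out-of-range or non-numeric values are treated as unvalidated and skipped.
        if is_number and 0 <= value <= 100:
            validated[key] = value

    rules = [
        "You are a supportive speech coach, not a clinician.",
        "Use only the scores listed below; if something is not listed, say you do not have it.",
        "Do not name or suggest any medical condition.",
        "Encourage the user to share the report with their provider.",
    ]
    score_lines = [f"- {name}: {value}" for name, value in validated.items()]
    return "\n".join(rules + ["Validated scores:"] + score_lines)
```

Filtering before the prompt is built means unexpected keys in the backend payload never reach the assistant, so the assistant cannot repeat them.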
Challenges we ran into
1. Turning speech into useful metrics
Speech quality is complex. We had to identify signals that were both measurable and meaningful, including:
- Words per minute
- Pause frequency
- Pronunciation confidence
- Pitch variation
- Rhythm consistency
Many ideas worked in theory but became noisy on everyday microphones.
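As an example of the first two signals, words per minute and pause frequency can be derived from word-level timestamps. The sketch below assumes each word arrives as a `(text, start_seconds, end_seconds)` tuple from the transcription step; that tuple layout, the function names, and the 0.3 second pause threshold are assumptions for illustration.

```python
def speaking_duration(words):
    """Seconds from the start of the first word to the end of the last word."""
    if not words:
        return 0.0
    return words[-1][2] - words[0][1]


def words_per_minute(words):
    """Speaking rate over the span covered by the recognized words."""
    duration = speaking_duration(words)
    if duration <= 0:
        return 0.0
    return len(words) * 60.0 / duration


def pauses_per_minute(words, min_gap=0.3):
    """Count silent gaps of at least min_gap seconds between consecutive words."""
    duration = speaking_duration(words)
    if len(words) < 2 or duration <= 0:
        return 0.0
    pauses = sum(
        1
        for previous, following in zip(words, words[1:])
        if following[1] - previous[2] >= min_gap
    )
    return pauses * 60.0 / duration
```

Timestamp-based measures like these depend on the transcription rather than on raw audio amplitude, which is one way to reduce sensitivity to microphone quality, although timestamp accuracy itself can still vary.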
2. Avoiding misleading AI outputs
Because this is healthcare-adjacent, safety mattered from the start. We built strict limits so the assistant could support users without pretending to replace a clinician.
3. Making results feel human
Numbers alone are not enough. We wanted people to feel informed and encouraged, so we focused on visuals, plain-language explanations, and progress tracking.
4. Coordinating multiple agents
Getting several agents to reliably handle reporting, history comparison, and coaching required careful system design and dependable message passing.
What we learned
We learned that strong healthcare technology is not only about model accuracy.
It is also about trust, accessibility, usability, and empathy.
Users do not just want scores. They want answers to questions like:
- Am I improving?
- What should I work on next?
- Can I share this with my provider?
- Do I have a clearer path forward?
We also learned that multi-agent systems work best when each agent has a clear role, rather than when one model is asked to do everything.
What's next
We see SpeakEasy growing into:
- Remote monitoring for speech therapy patients
- Neurological condition progress tracking
- Early speech screening for underserved communities
- Public speaking and education coaching
- Provider dashboards for long-term review
Our long-term vision is simple:
Healthcare should not begin only when someone reaches the clinic. It should begin wherever they live, speak, and grow.
We see SpeakEasy evolving beyond one-time assessments into a personalized rehabilitation platform where each session helps guide the next. Exercises in fluency, pronunciation, rhythm, and prosody could be adapted based on user progress, creating a continuous cycle of assessment, coaching, and support.
Built With
- agentverse
- elevenlabs
- fastapi
- gemma
- google-ai-studio
- librosa
- matplotlib
- postgresql
- praat
- python
- react
- reportlab
- supabase
- tailwind-css
- typescript
- uagents
- vite
- whisper