Inspiration
Stuttering therapy is often expensive, difficult to access, and emotionally overwhelming. Many people want a private, judgment-free way to practice speaking without pressure. This inspired us to imagine a system that uses artificial intelligence to offer support at any time. Voiceflow AI was created to provide a calm and empowering space for individuals who stutter.
What it does
Voiceflow AI detects stuttering in real time, including blocks, prolongations, and repetitions. It provides AI-based speech assessment and fluency scoring, and it tracks progress with clear visual analytics. It generates personalized exercises using an LLM with retrieval-augmented generation (RAG), and it offers supportive strategies and coping recommendations tailored to each user’s speech patterns. All of this happens in a private environment that encourages consistent practice.
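The write-up does not say how the fluency score is computed. As a rough illustration only, the sketch below assumes one widely used clinical measure, the percentage of syllables stuttered (%SS), together with a hypothetical event format (`DisfluencyEvent`) and the simplifying assumption that each detected event affects exactly one syllable. It is not the project's actual scoring code.

```python
from collections import Counter
from dataclasses import dataclass

# Disfluency types named in the write-up.
DISFLUENCY_TYPES = ("block", "prolongation", "repetition")


@dataclass
class DisfluencyEvent:
    """One detected disfluency (hypothetical format, not the project's schema)."""
    kind: str       # one of DISFLUENCY_TYPES
    start_s: float  # event start, in seconds
    end_s: float    # event end, in seconds


def percent_syllables_stuttered(events, total_syllables):
    """Return %SS: stuttered syllables per 100 syllables spoken.

    Assumes each event affects exactly one syllable (a simplification).
    """
    if total_syllables <= 0:
        raise ValueError("total_syllables must be positive")
    stuttered = min(len(events), total_syllables)
    return 100.0 * stuttered / total_syllables


def summarize(events, total_syllables):
    """Combine the overall score with a per-type count for a dashboard."""
    counts = Counter(e.kind for e in events)
    return {
        "percent_ss": percent_syllables_stuttered(events, total_syllables),
        "by_type": {kind: counts.get(kind, 0) for kind in DISFLUENCY_TYPES},
    }
```

A per-session summary like this could feed the progress analytics, with the per-type counts showing which pattern to target in the next exercise.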
How we built it
The frontend is built with React, with a focus on simple and calming interaction. It includes real-time audio capture with waveform and spectrogram visualization. The backend uses Python and FastAPI for audio processing and model inference. A SQL database hosted on Google Cloud stores session information and analytics. The intelligence layer combines speech models for disfluency detection, an acoustic analysis pipeline for segmentation, a Gemini-based LLM for personalized guidance, and a RAG system that draws on curated, therapy-aligned information.
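To make the RAG step concrete, here is a minimal sketch of retrieving a relevant snippet and assembling a prompt. Everything in it is an assumption for illustration: the `KNOWLEDGE_BASE` entries are placeholder text rather than clinical guidance, the bag-of-words cosine similarity stands in for whatever embedding search the project uses, and the prompt wording is invented. The call to the Gemini model is deliberately left out.

```python
import math
import re
from collections import Counter

# Placeholder snippets standing in for the curated, therapy-aligned material.
KNOWLEDGE_BASE = [
    "Easy onset practice for blocks: start voicing gently.",
    "Prolonged speech practice for repetitions: stretch vowels slightly.",
    "Pausing practice: take a short pause between phrases.",
]


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=1):
    """Return the k snippets most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def build_prompt(session_summary, docs, k=1):
    """Assemble a prompt that grounds the LLM in the retrieved snippets."""
    context = "\n".join(f"- {d}" for d in retrieve(session_summary, docs, k))
    return (
        "You are a supportive speech practice coach.\n"
        f"Reference material:\n{context}\n"
        f"Session summary: {session_summary}\n"
        "Suggest one short, encouraging exercise."
    )
```

Calling `build_prompt("many repetitions today", KNOWLEDGE_BASE)` produces a prompt whose reference material is the prolonged-speech snippet, which would then be sent to the LLM.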
Challenges we ran into
Detecting true disfluencies while ignoring noise, breaths, and artifacts required careful tuning. Providing real-time feedback demanded strict latency optimization. Ensuring that the guidance felt supportive and clinically appropriate took several design iterations. Creating a user experience that feels warm and empowering rather than clinical required thoughtful design choices. Handling different microphone qualities and recording environments added complexity to the processing pipeline.
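One small piece of the noise problem can be sketched with plain energy thresholding: keep spans whose frame energy stays above a threshold, and drop spans too short to be speech. This is an assumed, simplified stand-in for the project's acoustic pipeline, which is not described in detail. The function names and the default values (`frame_ms`, `threshold`, `min_ms`) are illustrative and untuned.

```python
import math


def frame_rms(samples, frame_len):
    """RMS energy of consecutive non-overlapping frames."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def voiced_segments(samples, sample_rate, frame_ms=20, threshold=0.05, min_ms=60):
    """Return (start_s, end_s) spans whose RMS stays at or above `threshold`.

    Spans shorter than `min_ms` are dropped as likely clicks or artifacts.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    min_frames = math.ceil(min_ms / frame_ms)
    energies = frame_rms(samples, frame_len)
    segments = []

    def close(first, last):
        # Keep the span only if it lasted long enough to be plausible speech.
        if last - first >= min_frames:
            segments.append((first * frame_ms / 1000, last * frame_ms / 1000))

    start = None
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
        elif start is not None:
            close(start, i)
            start = None
    if start is not None:  # a span that runs to the end of the audio
        close(start, len(energies))
    return segments
```

Small changes to `threshold` or `min_ms` can change which spans survive, which is one reason this kind of detection needs tuning per microphone and environment.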
Accomplishments that we're proud of
We built a functioning real-time stuttering detection system. We created a meaningful speech analytics dashboard that reflects user progress. We successfully combined signal processing with LLM reasoning through RAG. We produced a supportive user experience that goes beyond technical accuracy. We observed the system accurately identifying distinct types of stuttering patterns. We built something that can genuinely help people feel more confident speaking.
What we learned
We gained experience with real-time audio analysis in real-world environments. We learned about stuttering patterns and therapy approaches. We learned how to design AI systems that combine signal processing with language models. We discovered the importance of supportive and emotionally aware UX in tools that deal with personal challenges. We learned how sensitive speech analysis systems are to their detection thresholds.
What's next for Voiceflow AI
Planned next steps include a dedicated mobile application for easier daily practice and a live conversational coaching mode where users interact directly with an AI companion. We also want to add more advanced acoustic models that adapt to individual speaking styles, gamified progress milestones that make practice more rewarding, a therapist collaboration feature for securely sharing progress, and an offline mode for privacy-focused speech analysis.