Note: Submitting to LifeHacks theme of Hidden Leaf Hackathon

Inspiration

At a recent event, I met someone who had been diagnosed with both ADHD and Childhood Onset Fluency Disorder (stuttering). Whenever he was short of words, he would panic and start stammering. After discussing the difficulties he faced in those moments, I realized I could use LLMs to help him and others facing similar problems.

What it does

StutterEase is a real-time speech coaching assistant designed to support individuals who face challenges due to speech disorders, particularly Childhood Onset Fluency Disorder (Stuttering). It enables users to engage in natural, guided conversations where they can speak freely and receive contextual assistance when they get stuck.

  • When an individual stammers, the main issue is that their mind goes blank and they struggle to find the next words. StutterEase helps them overcome this through Speech Assistance, a feature that lets users record real-life conversations (your audio is used only for real-time assistance and is never stored or retained in any form). The audio is transcribed using Faster Whisper, and when a stutter is detected, the transcription is passed to an LLM, which generates next-word suggestions for the user.

  • Lack of confidence is another challenge. To help build confidence, StutterEase offers Conversation Coaching — a feature that lets users engage in scenario-based, natural language conversations with AI. Transcription and Text-to-Speech are used to simulate real-life audio interactions, making practice more immersive and realistic.
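The Speech Assistance flow above can be sketched roughly as follows. The stutter heuristic and prompt wording here are illustrative stand-ins for the app's actual detection and prompting logic, which the source doesn't specify:

```python
# Minimal sketch: transcribe a chunk, and if the tail of the transcript looks
# like a block (a repeated fragment), ask an LLM for next-word suggestions.
# looks_stuck() is a toy heuristic, not StutterEase's real detector.

def looks_stuck(transcript: str) -> bool:
    """Toy stutter detector: the same word repeated at the end of the
    transcript, e.g. 'I want to go to the the'."""
    words = transcript.lower().split()
    return len(words) >= 2 and words[-1] == words[-2]

def build_suggestion_prompt(transcript: str) -> str:
    """Strip the repeated fragment, then ask the LLM for continuations."""
    words = transcript.split()
    while len(words) >= 2 and words[-1].lower() == words[-2].lower():
        words.pop()
    partial = " ".join(words)
    return (
        "The speaker is mid-sentence and stuck. Suggest 3 short, natural "
        f'continuations (1-3 words each) for: "{partial}"'
    )

print(looks_stuck("I want to go to the the"))            # True
print(build_suggestion_prompt("I want to go to the the"))
```

In the real app the prompt would go to the LangChain-managed LLM and the suggestions would be pushed back to the client over the same WebSocket.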

How I built it

  1. App Interface: The app interface is built using Expo.
  2. Backend API: The API is built using FastAPI.
  3. Speech Recognition and Streaming: Audio is recorded with 'react-native-audio-recorder-player' and streamed to the backend over WebSockets.
  4. Transcription: The streamed audio is transcribed using 'faster-whisper'.
  5. LangChain: Handles all LLM-related tasks, such as suggesting next words and holding conversations with the user.
  6. Text-to-Speech: Used 'expo-speech' to render LLM responses as audio.
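Steps 3–4 can be sketched as follows: the client streams small PCM chunks over the socket, and the server buffers them until a window of audio accumulates, then hands the window to the transcriber. The window size and the stand-in transcriber are assumptions for illustration; the real app wraps faster-whisper here:

```python
# Illustrative server-side buffering for streamed audio. The transcriber is a
# stand-in callable so the buffering logic is self-contained; in the app it
# would wrap faster_whisper's WhisperModel.transcribe.

SAMPLE_RATE = 16_000                 # 16 kHz mono PCM (a common Whisper input)
BYTES_PER_SECOND = SAMPLE_RATE * 2   # 16-bit samples
WINDOW_SECONDS = 2                   # transcribe every ~2 s of audio (assumed)

class AudioBuffer:
    def __init__(self, transcribe):
        self.transcribe = transcribe
        self.buf = bytearray()
        self.transcripts = []

    def feed(self, chunk: bytes) -> None:
        """Append a streamed chunk; flush full windows to the transcriber."""
        self.buf.extend(chunk)
        window_bytes = WINDOW_SECONDS * BYTES_PER_SECOND
        while len(self.buf) >= window_bytes:
            window = bytes(self.buf[:window_bytes])
            del self.buf[:window_bytes]
            self.transcripts.append(self.transcribe(window))

# Stand-in transcriber: reports how much audio it received.
buf = AudioBuffer(lambda pcm: f"{len(pcm) // BYTES_PER_SECOND}s of audio")
for _ in range(5):                       # five 1-second chunks arrive
    buf.feed(b"\x00" * BYTES_PER_SECOND)
print(buf.transcripts)                   # ['2s of audio', '2s of audio']
```

Fixed-size windows keep transcription latency bounded: the transcriber never waits for the full recording, only for the next couple of seconds of speech.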

Challenges I ran into

While building this application, the major roadblock was latency: if next-word suggestions aren't delivered quickly, they are of no help to the user. To keep responses near real-time, StutterEase uses WebSockets and handles requests asynchronously wherever possible. Any operation that isn't essential to answering the user is deferred until after the response has been sent.
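This defer-the-rest pattern can be sketched with plain asyncio. All names here are illustrative; the real app applies the same idea inside its FastAPI WebSocket handlers:

```python
# Sketch of the latency strategy: send the reply on the critical path, then
# run non-essential work (here, a fake analytics write) as a deferred task.
import asyncio

log = []  # stand-in for a slow analytics/metrics store

async def record_analytics(suggestion: str) -> None:
    await asyncio.sleep(0.05)          # simulate a slow, non-critical write
    log.append(suggestion)

async def send_suggestion(ws_send, suggestion: str) -> None:
    await ws_send(suggestion)                          # reply to the user first
    asyncio.create_task(record_analytics(suggestion))  # defer the rest

async def main():
    sent = []

    async def fake_send(msg):          # stand-in for a WebSocket send
        sent.append(msg)

    await send_suggestion(fake_send, "next word")
    print(sent, log)                   # ['next word'] [] -- user already served
    await asyncio.sleep(0.1)           # give the deferred task time to finish
    print(log)                         # ['next word']

asyncio.run(main())
```

The user-facing send completes before the analytics coroutine ever runs, so slow bookkeeping never sits on the response path.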

What's next for StutterEase

The idea behind StutterEase is to build an ecosystem for people who face various problems due to speech disorders. The next step is to integrate clinically proven analysis methods and metrics to enhance progress tracking for users.

Built With

Expo · FastAPI · faster-whisper · LangChain · react-native-audio-recorder-player · expo-speech · WebSockets
