Project Story - EchoCoach: Your Private AI Speaking Coach
Inspiration
Public speaking is a critical skill, but many people struggle with it because affordable, personalized coaching and feedback are hard to come by. Most existing solutions rely on cloud processing, which raises privacy concerns and requires internet access. I was inspired to build a completely offline, privacy-first speaking coach that gives instant, actionable feedback using Google Chrome's built-in AI capabilities, putting professional speech coaching tools on everyone's own device.
What it does
EchoCoach is a Chrome extension that allows users to:
- Record speeches up to 30 seconds directly in the extension
- Input a script and get word-to-word alignment between spoken words and the script
- Receive detailed AI feedback on filler words, confidence, speaking pace, and key themes
- View delivery strengths and areas for improvement
- See a polished professional rewrite of their speech
- Get personalized coaching tips tailored to their speech performance
All processing happens offline using Chrome's Gemini Nano AI, protecting user privacy with no data leaving the device.
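The word-to-word alignment between a script and the spoken words can be illustrated with a classic edit-distance alignment. The sketch below is not taken from the EchoCoach source; the names `tokenize` and `alignWords` and the labels `match`, `substituted`, `omitted`, and `inserted` are assumptions chosen for illustration.

```javascript
// Hypothetical helper: lowercase and strip punctuation so "Hello," matches "hello".
function tokenize(text) {
  return text
    .toLowerCase()
    .replace(/[^\p{L}\p{N}'\s]/gu, " ")
    .split(/\s+/)
    .filter(Boolean);
}

// Align script words against spoken words with edit-distance dynamic
// programming, then backtrack to label every position.
function alignWords(scriptText, spokenText) {
  const a = tokenize(scriptText);
  const b = tokenize(spokenText);

  // cost[i][j] = minimum edits to turn the first i script words
  // into the first j spoken words.
  const cost = Array.from({ length: a.length + 1 }, () =>
    new Array(b.length + 1).fill(0)
  );
  for (let i = 0; i <= a.length; i++) cost[i][0] = i;
  for (let j = 0; j <= b.length; j++) cost[0][j] = j;
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      const diagonal = cost[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1);
      cost[i][j] = Math.min(diagonal, cost[i - 1][j] + 1, cost[i][j - 1] + 1);
    }
  }

  // Walk back from the bottom-right corner, preferring the diagonal move.
  const ops = [];
  let i = a.length;
  let j = b.length;
  while (i > 0 || j > 0) {
    const same = i > 0 && j > 0 && a[i - 1] === b[j - 1];
    if (i > 0 && j > 0 && cost[i][j] === cost[i - 1][j - 1] + (same ? 0 : 1)) {
      ops.push({ type: same ? "match" : "substituted", script: a[i - 1], spoken: b[j - 1] });
      i--;
      j--;
    } else if (i > 0 && cost[i][j] === cost[i - 1][j] + 1) {
      ops.push({ type: "omitted", script: a[i - 1], spoken: null });
      i--;
    } else {
      ops.push({ type: "inserted", script: null, spoken: b[j - 1] });
      j--;
    }
  }
  return ops.reverse();
}
```

For example, `alignWords("Good morning", "good, um, morning!")` labels "um" as `inserted`, which is one way a filler word could surface in the alignment view. When several alignments have the same cost, this sketch prefers substitutions over an insert-plus-omit pair; a production version might weight those choices differently.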
How we built it
The project is built as a Chrome extension using:
- JavaScript for UI and logic
- Chrome's Built-in AI APIs: Prompt API (LanguageModel), Proofreader, Rewriter, and Writer APIs
- Web Audio API for microphone recording in WebM format
- Local Chrome storage for session data persistence
- Careful error handling and fallback mechanisms to ensure a seamless user experience even if some APIs are unavailable
We used the new Prompt API for the core speech analysis and designed custom prompts for each feature. The Proofreader, Rewriter, and Writer APIs handle grammar checking, polishing, and personalized tips respectively; these required origin trial tokens for full functionality. Because everything runs on-device, the extension meets its privacy and offline-first requirements.
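As a rough illustration of the Prompt API flow described here, the sketch below checks for the `LanguageModel` global, asks for availability, and sends one custom prompt. The function names `buildPrompt` and `analyzeSpeech`, the prompt wording, and the `reason` strings are assumptions for illustration, not EchoCoach's actual code; the `scope` parameter exists only so the function can be exercised with a stub outside Chrome.

```javascript
// Hypothetical prompt template; the real prompts are not shown in this write-up.
function buildPrompt(transcript) {
  return [
    "You are a public speaking coach.",
    "Analyze the speech below for filler words, confidence, pace, and key themes.",
    "Reply with a single JSON object only.",
    "Speech:",
    transcript,
  ].join("\n");
}

// Sketch: confirm the Prompt API is usable before creating a session.
// `scope` defaults to globalThis, where Chrome exposes LanguageModel.
async function analyzeSpeech(transcript, scope = globalThis) {
  const api = scope.LanguageModel;
  if (!api) return { ok: false, reason: "prompt-api-missing" };

  // availability() reports whether the on-device model can be used.
  const availability = await api.availability();
  if (availability === "unavailable") {
    return { ok: false, reason: "model-unavailable" };
  }

  // create() may trigger a model download when the model is only downloadable.
  const session = await api.create();
  try {
    const reply = await session.prompt(buildPrompt(transcript));
    return { ok: true, reply };
  } finally {
    // Release the session's resources whether or not the prompt succeeded.
    session.destroy();
  }
}
```

Returning a plain `{ ok, reason }` object instead of throwing keeps the calling UI code simple: it can show a fallback message when the API or model is missing without wrapping every call in its own `try` block.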
Challenges we ran into
- Integrating Chrome's experimental built-in AI APIs was complex, requiring understanding of origin trials, tokens, and browser flags.
- Audio transcription in the browser is still experimental, so as a first step we approximated the speech content from the user's script input.
- Ensuring privacy with 100% offline processing meant we could not use cloud AI services, adding to integration complexity.
- Displaying detailed, user-friendly feedback required multiple fallback strategies for cases where an API function is unavailable.
- Debugging asynchronous AI calls and parsing JSON responses robustly took significant iteration.
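One common way to parse JSON from model output robustly is to tolerate extra prose and markdown code fences around the object. The helper below is a minimal sketch of that idea under the assumption that the model was asked for a single JSON object; `parseModelJson` is a hypothetical name and not necessarily how EchoCoach does it.

```javascript
// Extract and parse the first-to-last brace span from a model reply.
// Returns `fallback` instead of throwing when no valid JSON object is found.
function parseModelJson(raw, fallback = null) {
  if (typeof raw !== "string") return fallback;

  // Drop markdown code-fence markers (three backticks, optional "json" tag).
  const unfenced = raw.replace(/`{3}(?:json)?/gi, "");

  const start = unfenced.indexOf("{");
  const end = unfenced.lastIndexOf("}");
  if (start === -1 || end <= start) return fallback;

  try {
    return JSON.parse(unfenced.slice(start, end + 1));
  } catch {
    // Malformed JSON (trailing commas, cut-off output, and so on).
    return fallback;
  }
}
```

Passing a default object as `fallback` lets the UI render an "analysis unavailable" state instead of crashing on a malformed reply.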
Accomplishments that we're proud of
- Delivering a fully functional MVP that works 100% offline, respecting user privacy.
- Seamlessly integrating four distinct Chrome built-in AI APIs into one coherent application.
- Designing and implementing a unique word-to-word script alignment feature with rich error classification.
- Creating an intuitive, modern UI that showcases detailed AI feedback in real time.
- Handling complex asynchronous logic with multiple fallbacks to guarantee a stable user experience.
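Asynchronous logic with multiple fallbacks can be organized as an ordered list of providers that are tried one after another. The following is a sketch of that pattern only; `firstSuccessful`, the provider names, and the stand-in functions are assumptions, and real providers would wrap the built-in AI calls.

```javascript
// Try each provider in order and return the first result that succeeds.
// Failures are collected so they can be logged or shown for debugging.
async function firstSuccessful(providers, input) {
  const errors = [];
  for (const { name, run } of providers) {
    try {
      const value = await run(input);
      return { source: name, value, errors };
    } catch (err) {
      errors.push({ name, message: String(err && err.message ? err.message : err) });
    }
  }
  // Every provider failed; the caller decides what to display.
  return { source: null, value: null, errors };
}

// Stand-in providers for illustration only.
const exampleProviders = [
  { name: "rewriter", run: async () => { throw new Error("Rewriter unavailable"); } },
  { name: "passthrough", run: async (text) => text.trim() },
];
```

With `exampleProviders`, a call such as `firstSuccessful(exampleProviders, "  my speech ")` falls through the failing first entry and resolves with the passthrough result, along with a record of what failed.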
What we learned
- How powerful on-device AI can be, enabling sophisticated natural language processing without cloud dependencies.
- The intricacies of Chrome's experimental AI web APIs and origin trial system.
- Best practices in UX design for feedback-heavy AI applications.
- Effective strategies for error handling and graceful degradation in API-dependent applications.
What's next for EchoCoach - Your Private AI Speaking Coach
- Implement real-time speech-to-text transcription inside the extension so that analysis no longer depends on script input.
- Enhance session history and progress tracking for longitudinal user improvement.
- Add support for multiple languages and accessibility features.
- Enable exporting feedback reports and integration with popular document platforms like Google Docs.
- Develop more advanced AI-driven coaching metrics like emotion and intonation analysis.
- Continue refining UI based on user testing and feedback.
EchoCoach aims to democratize access to professional speaking coaching by combining advanced Chrome AI tech with a privacy-first, offline approach, making confident communication accessible for all.
Built With
- api
- css
- gemini
- html5
- javascript
- prompt
- proofreader
- rewriter
- writer