Inspiration

Deaf and hard-of-hearing students often struggle to keep up with fast-paced lectures, and international students face language barriers in the classroom. We wanted to create an accessible tool that captures spoken content in real time and breaks down those barriers through instant translation.

What it does

SpeakEasy is a real-time speech-to-text captioning app with live translation. An AI agent gathers the captions and turns them into clean, well-organized notes, and a built-in chatbot lets you ask questions based on your notes from class.

How we built it

We built SpeakEasy using React and TypeScript for a responsive frontend, integrated the Web Speech API for real-time speech recognition, and connected Google Cloud Translation API for multi-language support. We implemented React Context for state management, custom hooks for speech recognition and audio monitoring, and local storage for session persistence.
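As an illustration of the local storage piece, here is a minimal sketch of how session persistence could look. It is not the app's actual code: the `CaptionSession` shape, the `speakeasy.sessions` key, and the helper names are all assumptions made for the example. The helpers take any object with `getItem`/`setItem`, so in the browser you would pass `window.localStorage`, and elsewhere an in-memory stand-in.

```typescript
// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical session shape; the real app's fields may differ.
interface CaptionSession {
  id: string;
  startedAt: string; // ISO timestamp
  captions: { original: string; translated?: string }[];
}

const STORAGE_KEY = "speakeasy.sessions"; // assumed key name

// Read all saved sessions, falling back to an empty list on missing or corrupted data.
function loadSessions(store: KeyValueStore): CaptionSession[] {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as CaptionSession[]) : [];
  } catch {
    return [];
  }
}

// Insert a session, or overwrite the saved copy that has the same id.
function saveSession(store: KeyValueStore, session: CaptionSession): void {
  const others = loadSessions(store).filter((s) => s.id !== session.id);
  store.setItem(STORAGE_KEY, JSON.stringify([...others, session]));
}

// In-memory stand-in for localStorage (useful in tests or server-side rendering).
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
  };
}
```

Calling `saveSession(window.localStorage, session)` after each new caption keeps the saved copy current, and `loadSessions(window.localStorage)` restores the list on page load. Writing the whole list on every save is simple but would get slow for very long sessions.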

Challenges we ran into

Getting the translation API to trigger correctly was tricky: we initially confused the UI language setting with the caption translation setting, so changing one did not do what we expected for the other. We also had to handle microphone permissions across different browsers, manage real-time state updates without causing performance issues, and make translations fast enough to feel instant to users.
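One way to avoid the UI-language versus caption-language mix-up is to keep the two settings in separate, explicitly named fields and derive the translation request only from the caption fields. The sketch below is hypothetical rather than our exact code: `LanguageSettings` and `buildTranslateRequest` are illustrative names, and the `q`, `source`, `target`, and `format` fields are assumed to map onto the Cloud Translation v2 (Basic) request parameters.

```typescript
// Illustrative settings shape: UI language and caption languages are kept apart on purpose.
interface LanguageSettings {
  uiLanguage: string; // language of menus and buttons; never used for translation
  captionSourceLanguage: string; // language being spoken, e.g. "en-US"
  captionTargetLanguage: string | null; // null means caption translation is off
}

// Request body fields, assumed to match Cloud Translation v2 parameters.
interface TranslateRequest {
  q: string;
  source: string;
  target: string;
  format: "text";
}

// Reduce a tag like "en-US" to its primary subtag "en" (a simplification).
function primarySubtag(tag: string): string {
  return tag.split("-")[0].toLowerCase();
}

// Returns a request body, or null when no translation call should be made.
function buildTranslateRequest(
  text: string,
  settings: LanguageSettings
): TranslateRequest | null {
  const target = settings.captionTargetLanguage;
  const trimmed = text.trim();
  if (target === null || trimmed === "") return null;

  // Skip the API call when speech and target are the same language.
  const source = primarySubtag(settings.captionSourceLanguage);
  if (primarySubtag(target) === source) return null;

  return { q: trimmed, source, target, format: "text" };
}
```

Because `uiLanguage` is never read inside `buildTranslateRequest`, switching the interface to Spanish cannot accidentally start or stop caption translation. Returning `null` for empty or same-language captions also saves round trips, which helps translations feel instant.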

Accomplishments that we're proud of

We're proud of achieving real-time translations that appear almost instantly alongside the original captions. The clean, retro-pixel UI makes the app both functional and visually appealing. We also implemented a full session management system that saves and displays sessions and lets users download their notes.

What we learned

We learned how to work with browser APIs like the Web Speech API, handle asynchronous API calls efficiently, and manage complex state in React applications. We also gained experience with proper API key security, debugging translation services, and creating accessible interfaces that work across different languages.

What's next for SpeakEasy

We plan to add support for more languages, implement AI-powered summaries of captured sessions, add collaboration features so multiple users can view the same live captions, and optimize for mobile devices. We'd also like to add speaker identification and custom vocabulary for technical terms in specific subjects.
