Everyone Subtitle

Why We Built It We made a product so deaf people can communicate easily with everyone around them. A lot of AI assistants feel robotic and detached. We wanted something that feels human — like your new voice, not just a generic app.

What It Does Live subtitles – Word-by-word transcription as you speak (fast, AssemblyAI real-time). Smart replies – OpenAI generates responses in your style, not canned chatbot text. Your voice – Choose from AI voices (Sara, Kim, Ema, Alex) and preview: “Hey, I am {Your Name}.” Personality profile – A fun 30-question quiz builds a profile so replies sound like you. Seamless flow – Speak → see captions → pick a response → it speaks for you.

How We Built It UI/UX: Flutter + GetX (reactive, smooth, minimal boilerplate). Speech-to-text: AssemblyAI WebSocket streaming (handles partial + final results, micro-batching). AI brains: OpenAI Chat Completions for personality-aware responses, OpenAI TTS for speech. Audio: just_audio for playback, fallback to flutter_tts for reliability. Backend: Firebase Auth + Firestore for storing quiz results and user profiles. Secrets: .env via flutter_dotenv — no hardcoded API keys.

Challenges Realtime streaming – Avoiding flicker between partial and final captions. Prompt design – Making sure replies stick to the user’s identity without hallucinating details. State pitfalls – Fixed Obx scope issues, double-tap pause crashes, and async “late callback” bugs. UX polish – Removed unnecessary loading placeholders; kept layouts consistent even with long captions.

Accomplishments Subtitles that update live, word by word — feels natural, not laggy. Personality-aware replies that sound human and aligned to the user. Voices that actually say your name in your chosen style. Login flow goes straight to action, no blockers. Clean secret handling — no leaks, no exposed keys.

What We Learned How to keep streaming audio stable with token lifecycles and micro-batching. The quirks of audio across iOS/Android and why fallback hierarchies matter. Solid GetX patterns (keep reactive reads scoped, avoid re-entrancy). Prompt engineering for style and safety. UX is won in the small details: button states, text consistency, smooth routing.

What’s Next Multilingual: Add more languages with auto translation. Voice upgrades: More voices, better prosody, and optional voice cloning (opt-in). Memory: Conversation history with adaptive personality over time. Growth: Public beta, feedback loops, and partnerships with accessibility organizations.

Built With

Share this project:

Updates