Inspiration

MyEcho started as a way for my daughter to hear bedtime stories in my voice when I wasn’t around. I wanted the experience to be personal, interactive, and dynamic—something that responded to her curiosity and imagination. That became MyEcho: a real-time, voice-driven storytelling engine that doubles as an interactive audiobook and RPG.

What it does

🎮 Interactive audiobook meets RPG (first of its kind) – Kids play the story using only their voice. Choices shape the plot, characters evolve, and progress carries across sessions.

🌌 RPG Modes – Choose from immersive, voice-navigated worlds:

Space Pirate – Upgrade your ship, trade loot, escape space law.

Fantasy Quest – Cast spells, build alliances, confront magical creatures.

Cyberpunk Survival – Hack, sneak, and navigate neon-lit danger.

🧬 Voice-clone narration – Upload a short voice sample to generate a high-fidelity voice model.

🧠 Real-time AI story engine – Say “Run into the cave” or “Make the dragon a friend,” and the story adapts live.

🗣️ Voice-only navigation – No screens. Kids talk, the story responds.

📤 Family-sharing mode – Remote loved ones can record and send custom story chapters using their own voice clone.

How we built it

🧠 Gemini 2.5 powers real-time story generation. A Cloud Run service manages branching logic and context continuity.
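
As a sketch of the context-continuity piece: the orchestration service has to fit recent story turns into each model call without losing the thread. The names below (`StoryTurn`, `build_prompt`) and the rough 4-characters-per-token heuristic are illustrative assumptions, not the production code.

```python
from dataclasses import dataclass

@dataclass
class StoryTurn:
    speaker: str   # "child" or "narrator"
    text: str

def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token.
    return max(1, len(text) // 4)

def build_prompt(system: str, turns: list, budget: int = 2000) -> str:
    kept = []
    used = estimate_tokens(system)
    # Walk newest-to-oldest so the most recent context survives trimming.
    for turn in reversed(turns):
        cost = estimate_tokens(turn.text)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    lines = [system] + [f"{t.speaker}: {t.text}" for t in reversed(kept)]
    return "\n".join(lines)
```

Trimming oldest-first keeps the child's latest choice in context even on long sessions; older chapters would be carried forward as summaries rather than raw turns.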

🗣️ ElevenLabs Conversational AI handles STT → NLU → TTS. It transcribes child input, interprets intent, and responds in a cloned voice.
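
The real intent interpretation is handled by ElevenLabs Conversational AI, but a toy keyword router shows the shape of the NLU step that maps a child's transcribed utterance to a story action (the intent names and word lists here are illustrative assumptions):

```python
def classify_intent(utterance: str) -> str:
    # Match on whole words, not substrings, so "dragon" doesn't trigger "go".
    words = set(utterance.lower().split())
    if words & {"run", "go", "enter", "climb", "walk"}:
        return "move"
    if words & {"make", "turn", "become"}:
        return "transform"
    if words & {"talk", "ask", "say", "tell"}:
        return "dialogue"
    return "freeform"

# classify_intent("Run into the cave") -> "move"
# classify_intent("Make the dragon a friend") -> "transform"
```

Anything that falls through to "freeform" is passed to the story engine as open-ended narration input rather than a structured command.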

🎤 Voice cloning via ElevenLabs Voice API.

🔧 Next.js frontend, styled with Tailwind and deployed via Vercel for edge performance.

🔄 Google Cloud Run hosts all backend services, including story orchestration, voice asset processing, and moderation workflows.

🔐 Firebase provides auth, a real-time DB for story state, and Cloud Storage for voice and media assets.
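
To illustrate how story progress can carry across sessions, here is a sketch of a per-child story-state document as it might sit in the real-time DB. The field names are assumptions for illustration, not the production schema:

```python
def new_story_state(child_id: str, mode: str) -> dict:
    return {
        "childId": child_id,
        "mode": mode,        # e.g. "space_pirate", "fantasy_quest", "cyberpunk"
        "chapter": 1,
        "inventory": [],
        "flags": {},         # branch decisions, e.g. {"dragon_friend": True}
        "history": [],       # summarized turns for context continuity
    }

def apply_choice(state: dict, flag: str, value=True) -> dict:
    # Record a branch decision immutably so it persists across sessions.
    updated = {**state, "flags": {**state["flags"], flag: value}}
    updated["history"] = state["history"] + [flag]
    return updated
```

Writing decisions as flags rather than raw transcript keeps the document small and lets the story engine query "what has already happened" cheaply on the next session.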

💳 Stripe + RevenueCat for managing usage tiers and subscriptions.

🔁 CI/CD pipeline with GitHub Actions → Cloud Build → Cloud Run. Built to be modular and scalable.

Challenges we ran into

⚖️ Latency vs. immersion – Getting AI responses fast enough to feel conversational took tuning and optimization.
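
One common trick for this kind of latency tuning (sketched here as an assumption about the approach, not the exact pipeline) is to chunk the model's streamed output at sentence boundaries, so TTS can start speaking the first sentence before the full reply has finished generating:

```python
import re

def sentence_chunks(stream_pieces):
    # Accumulate streamed text fragments and yield complete sentences as soon
    # as they close, so speech synthesis can begin before generation ends.
    buf = ""
    for piece in stream_pieces:
        buf += piece
        while True:
            m = re.search(r"[.!?]\s+", buf)
            if not m:
                break
            yield buf[: m.end()].strip()
            buf = buf[m.end():]
    if buf.strip():
        yield buf.strip()
```

Perceived latency then becomes the time to the first sentence, not the time to the whole response.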

🧠 Branch coherence – We had to build custom scoring and reranking logic to keep story branches consistent and safe.
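
A minimal sketch of what scoring-and-reranking candidate continuations can look like, assuming a simple word-overlap coherence signal and a safety penalty (the real logic is richer; these functions are illustrative):

```python
def score_branch(candidate: str, established_facts: set, banned: set) -> float:
    words = set(candidate.lower().split())
    coherence = len(words & established_facts)   # reward references to known story facts
    safety_penalty = 10 * len(words & banned)    # heavily penalize unsafe terms
    return coherence - safety_penalty

def rerank(candidates: list, facts: set, banned: set) -> str:
    # Generate several continuations, keep the highest-scoring one.
    return max(candidates, key=lambda c: score_branch(c, facts, banned))
```

Generating a handful of candidates and reranking them trades a little latency for much more consistent branches.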

🧒 Child voice recognition – Kids have distinctive speech patterns and pronunciation. We tuned recognition thresholds to reduce misfires.

🛡️ Content safety – Story moderation runs in real-time to prevent inappropriate output and auto-correct unsafe branches.
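
The auto-correct behavior can be sketched as a post-generation gate: if a narration chunk trips the safety check, swap in a safe regenerated branch instead of speaking it. The term list and function names here are illustrative assumptions, not the production moderation system:

```python
UNSAFE_TERMS = {"blood", "kill"}  # illustrative list, not the real one

def moderate(narration: str, safe_fallback: str) -> str:
    # Gate every narration chunk before it reaches TTS; on a hit,
    # substitute a safe re-generated branch ("auto-correct").
    words = set(narration.lower().split())
    if words & UNSAFE_TERMS:
        return safe_fallback
    return narration
```

Running this check per chunk (rather than per full story) is what makes real-time moderation compatible with streaming narration.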

Accomplishments we're proud of

🚀 MVP launched in 48 hours – Fully functional, demo-ready, and tested with real families.

🧠 LLM + voice clone integration – Built a complete, modular pipeline combining Gemini + ElevenLabs into a playable narrative system.

🧱 Clean, scalable architecture – A fully modular Cloud Run architecture means new voices, story modes, or AI engines can drop in easily.

What we learned

🎙️ Voice = presence – Cloned narration from a parent feels more real than any studio-grade voice.

🧒 Natural language over buttons – Letting kids talk to the story made interaction more intuitive and fun.

🧩 Cloud-native pays off – Microservices and clean separation of concerns made rapid iteration easy.

What’s next for MyEcho

🌍 Multilingual support + translation – Spanish and Arabic first, with narration in the parent’s voice.

🧭 Expanded game logic – Trivia/r

📊 Parent dashboard – See story themes, vocabulary use, and what choices your child makes.

🏫 School & therapy edition – Custom tools for education, literacy, and neurodivergent communication.
