Inspiration

We noticed that most "AI for Kids" applications fall into two boring categories: text-based chatbots that are too complex for children, or pre-scripted games that lack imagination. We wanted to build something in the middle: an infinite storybook that feels like a cinematic audio drama. We were inspired by the idea of a "Digital Dungeon Master" for bedtime stories, an AI that doesn't just read text but acts it out, changing its voice and appearance instantly to match the character speaking.

What it does

Adventure Flip is an interactive storytelling engine where the story comes alive visually and audibly.

  • Choose a Theme: The child selects a path (e.g., A Dragon's Cave, A Magical Forest).
  • Real-time Drama: The AI generates a story segment, but instead of a monologue, it creates a script with multiple actors.
  • Dynamic Casting: When the Narrator speaks, the voice is calm and the screen shows a storybook. When the Dragon roars, the screen "flips" to show the dragon, and the voice instantly deepens and growls.
  • Interactive Choices: The story pauses for the child to speak their decision, guiding the plot in infinite directions.
  • Multilingual Stories: While most AI apps focus on English, Adventure Flip uses ElevenLabs Multilingual v2 to bring high-quality, expressive storytelling to underrepresented languages like Indonesian.

How we built it

We moved away from simple chatbot wrappers and built a Custom Director Engine to orchestrate the experience:

  • The Brain (Google Gemini 2.5 Flash): We didn't just ask Gemini for text. We engineered a strict System Prompt that forces Gemini to output Structured JSON. This JSON separates the story into "segments," identifying the speaker, text, and emotion for every line.
  • The Voice (ElevenLabs API): This is where the magic happens. We implemented a Dynamic Voice Tuner.
      • For the Narrator, we set high stability (0.7) for a clear, audiobook-like experience.
      • For the Dragon, we lowered the stability (0.3) and increased style to generate unpredictable, gravelly, and dramatic creature voices.
  • The Orchestrator (Node.js & Express): Our backend acts as a traffic controller. It parses the JSON script from Gemini, generates audio for each line in parallel using the ElevenLabs voice ID assigned to that character, and returns the bundled audio to the frontend.
  • The Stage (React & Vite): The frontend implements a "Playback Queue" system. It plays the audio segments sequentially while triggering CSS animations to swap the character avatar (The "Flip") exactly when the audio begins.
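
The pipeline described above (per-character voice profiles, parallel audio generation, then strictly sequential playback) could look roughly like the sketch below. It is an illustration, not our production code: the voice IDs, the similarity_boost and style numbers, and the `synthesize`, `showAvatar`, and `playClip` callbacks are hypothetical placeholders. Only the two stability values (0.7 and 0.3) come from the description above.

```typescript
// One line of the script: who speaks, what they say, and with which emotion.
interface Segment {
  speaker: string;
  text: string;
  emotion: string;
}

// Voice settings named in this writeup: stability, similarity_boost, style.
interface VoiceSettings {
  stability: number;
  similarity_boost: number;
  style: number;
}

interface VoiceProfile {
  voiceId: string; // placeholder IDs below, not real ElevenLabs voices
  settings: VoiceSettings;
}

// Calm, steady narrator (stability 0.7 is from the text; the rest is assumed).
const NARRATOR: VoiceProfile = {
  voiceId: "narrator-voice-id",
  settings: { stability: 0.7, similarity_boost: 0.75, style: 0.0 },
};

// Less stable, more stylized dragon (stability 0.3 is from the text).
const DRAGON: VoiceProfile = {
  voiceId: "dragon-voice-id",
  settings: { stability: 0.3, similarity_boost: 0.75, style: 0.6 },
};

const PROFILES: Record<string, VoiceProfile> = { Narrator: NARRATOR, Dragon: DRAGON };

// Unknown speakers fall back to the narrator profile (an assumption).
function settingsFor(speaker: string): VoiceProfile {
  return PROFILES[speaker] ?? NARRATOR;
}

// Hypothetical wrapper around the text-to-speech call; resolves to a clip URL.
type Synthesize = (text: string, profile: VoiceProfile) => Promise<string>;

interface AudioSegment {
  segment: Segment;
  clip: string;
}

// Request every clip at once; Promise.all keeps results in script order.
async function generateAudio(segments: Segment[], synthesize: Synthesize): Promise<AudioSegment[]> {
  return Promise.all(
    segments.map(async (segment) => ({
      segment,
      clip: await synthesize(segment.text, settingsFor(segment.speaker)),
    })),
  );
}

// Play clips one after another, flipping the avatar right before each starts.
async function playQueue(
  items: AudioSegment[],
  showAvatar: (speaker: string) => void,
  playClip: (clip: string) => Promise<void>,
): Promise<void> {
  for (const item of items) {
    showAvatar(item.segment.speaker);
    await playClip(item.clip);
  }
}
```

Generating in parallel keeps the wait short, while awaiting each clip in the loop is what keeps the avatar flip and the voice change in step.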

Challenges we ran into

  • The "Amnesia" Problem: Initially, the AI would forget details (e.g., the Dragon turning green). We solved this by injecting a "Character Sheet" into every prompt to enforce consistency.
  • The Never-Ending Story: The AI would often ramble on forever. We implemented a Turn-Based Logic system. We pass a turnCount to the backend; if it's turn 9, the prompt instructs the AI to force a climax. If it's turn 10, it forces an ending.
  • Visual-Audio Sync: Getting the avatar to change exactly when the voice changed was tricky. We solved this by splitting the generation into atomic segments and playing them via a strict queue in React.
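
As a rough sketch of the first two fixes, the prompt for each turn can be assembled from a character sheet plus a pacing instruction chosen from turnCount. The `CharacterSheet` shape, the function names, and the exact instruction wording here are illustrative assumptions; only the turn 9 (climax) and turn 10 (ending) rules come from the description above.

```typescript
// Facts the model must keep consistent from turn to turn.
interface CharacterSheet {
  name: string;
  traits: string[];
}

// From the text: turn 9 forces a climax, turn 10 forces an ending.
const MAX_TURNS = 10;

// Pick the pacing instruction for this turn (wording is hypothetical).
function pacingInstruction(turnCount: number): string {
  if (turnCount >= MAX_TURNS) {
    return "This is the final turn. Bring the story to an ending.";
  }
  if (turnCount === MAX_TURNS - 1) {
    return "Build to the climax of the story in this turn.";
  }
  return "Continue the story and finish with a choice for the child.";
}

// Assemble the per-turn prompt: character sheet first, then pacing, then the choice.
function buildPrompt(sheets: CharacterSheet[], turnCount: number, childChoice: string): string {
  const sheetLines = sheets.map((c) => `- ${c.name}: ${c.traits.join(", ")}`);
  return [
    "Character sheet (keep these details consistent):",
    ...sheetLines,
    `Turn ${turnCount} of ${MAX_TURNS}.`,
    pacingInstruction(turnCount),
    `The child chose: ${childChoice}`,
  ].join("\n");
}
```

Because the sheet is re-sent on every turn, consistency does not depend on the model remembering earlier output.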

Accomplishments that we're proud of

  • Successfully implementing Multi-Voice Synthesis in a single story flow.
  • Creating a "Safety Guardrail" that keeps stories child-friendly while allowing for exciting conflict.
  • Building a responsive UI that feels like a polished game using simple CSS animations (breathing/talking effects).

What we learned

  • Prompt Engineering is Software Engineering: We learned that getting reliable JSON out of an LLM requires rigorous prompt structuring and fallback error handling.
  • Voice Design Matters: We discovered how powerful ElevenLabs' stability and similarity_boost parameters are. Tweaking a slider by 10% can be the difference between a "Robot" and a "Monster".
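
To illustrate what "fallback error handling" can mean in practice, here is a minimal sketch of defensive parsing for the model's reply. It assumes the reply is an object with a top-level `segments` array of speaker/text/emotion entries; that exact shape, the `parseScript` name, and the fallback line are assumptions made for the example.

```typescript
interface Segment {
  speaker: string;
  text: string;
  emotion: string;
}

// Safe script to play when the model's reply cannot be used (wording is made up).
const FALLBACK: Segment[] = [
  { speaker: "Narrator", text: "The storyteller lost the page. Let's try that again.", emotion: "calm" },
];

// Runtime check that a parsed value really has the three string fields.
function isSegment(value: unknown): value is Segment {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.speaker === "string" && typeof v.text === "string" && typeof v.emotion === "string";
}

function parseScript(raw: string): Segment[] {
  // Keep only the outermost {...} so stray text around the JSON is ignored.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return FALLBACK;
  try {
    const data: unknown = JSON.parse(raw.slice(start, end + 1));
    const segments = (data as { segments?: unknown }).segments;
    if (Array.isArray(segments) && segments.length > 0 && segments.every(isSegment)) {
      return segments as Segment[];
    }
  } catch {
    // Malformed JSON: fall through to the fallback script.
  }
  return FALLBACK;
}
```

Returning a playable fallback instead of throwing means a bad reply costs one retry rather than a broken story screen.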

What's next for Adventure Flip

  • Background Music: Integrating AI music generation to match the mood (scary music for dragons, sparkling music for fairies).
  • Latency Reduction: Upgrading our custom backend to use WebSockets for faster response times.
  • User Library: Allowing kids to save their favorite stories to replay later.
