💡 Inspiration

We wanted to shift children from passive screen time to active creativity. Our goal was to build a "real" Magic Wizard that listens to a child's voice and instantly brings their imagination to life, turning bedtime stories into a collaborative creation process rather than just consumption.

📖 What it does

LoreLoom is a multimodal storybook platform where voice becomes reality.

  • Magic Voice Mode: Children brainstorm live with an AI Wizard using real-time voice interaction (no typing required).
  • Instant Weaving: The app generates cohesive stories with consistent characters and watercolor art based on the conversation.
  • Cinematic Magic: One click transforms static pages into 720p animations using video generation models.
  • Arcade: Auto-generates quizzes and memory games based on the specific plot to check and reinforce comprehension.

⚙️ How we built it

The app is a React frontend powered entirely by Google's Gemini model family and Veo:

  • Gemini Live API: Handles real-time, low-latency voice interaction via WebSockets.
  • Gemini 3 Pro: Manages complex narrative logic and safety.
  • Gemini 2.5 Flash Image: Creates the watercolor artwork.
  • Veo 3.1: Generates video animations from the illustrations.
  • Gemini TTS: Provides emotive, character-distinct narration.

🚧 Challenges & Learnings

Synchronizing raw PCM audio streams from the Live API WebSocket for glitch-free playback was our biggest technical hurdle. Audio arrives in small chunks, and any gap or overlap between them is audible as a click or stutter. We also learned that combining voice, video, and text creates an emotional connection that text prompts alone cannot match.
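
One common way to get gapless playback from streamed PCM with the Web Audio API is to convert each chunk to floats and start it at the exact time the previous chunk ends. The sketch below illustrates that idea and is not necessarily how LoreLoom does it. The names `pcm16ToFloat32`, `ChunkScheduler`, and `playChunk` are hypothetical, and the 24 kHz mono, 16-bit little-endian format and the 50 ms lead time are assumptions.

```typescript
// Assumed stream format: mono, 16-bit little-endian PCM at 24 kHz.
const SAMPLE_RATE_HZ = 24000;

/** Convert little-endian signed 16-bit PCM bytes to floats in [-1, 1). */
function pcm16ToFloat32(bytes: Uint8Array): Float32Array {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const sampleCount = Math.floor(bytes.byteLength / 2);
  const out = new Float32Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    out[i] = view.getInt16(i * 2, true) / 32768;
  }
  return out;
}

/** Tracks where the next chunk should start so chunks play back to back. */
class ChunkScheduler {
  private nextStart = 0;
  private leadTimeSec: number;

  constructor(leadTimeSec = 0.05) {
    this.leadTimeSec = leadTimeSec;
  }

  /** Returns the start time for a chunk lasting `durationSec`, given the clock value `now`. */
  schedule(now: number, durationSec: number): number {
    // After an underrun the queue is in the past, so restart slightly ahead of `now`.
    const start = Math.max(this.nextStart, now + this.leadTimeSec);
    this.nextStart = start + durationSec;
    return start;
  }
}

/** Browser-only glue: queue one PCM chunk on an AudioContext. */
function playChunk(ctx: AudioContext, scheduler: ChunkScheduler, bytes: Uint8Array): void {
  const samples = pcm16ToFloat32(bytes);
  if (samples.length === 0) return;
  const buffer = ctx.createBuffer(1, samples.length, SAMPLE_RATE_HZ);
  buffer.getChannelData(0).set(samples);
  const source = ctx.createBufferSource();
  source.buffer = buffer;
  source.connect(ctx.destination);
  source.start(scheduler.schedule(ctx.currentTime, buffer.duration));
}
```

Each incoming WebSocket audio message would be decoded to bytes and passed to `playChunk`. Scheduling against the audio clock (`ctx.currentTime`) instead of starting each chunk when it arrives keeps network jitter from turning into audible gaps.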

Built With

  • gemini-3-pro
  • gemini-live-api
  • google-gemini-api
  • html2canvas
  • indexeddb
  • jspdf
  • react
  • tailwind-css
  • typescript
  • veo
  • vite
  • web-audio-api