The Inspiration

Every child has asked the same question: "What if my toys could talk back?" Inspired by the timeless wonder of childhood imagination and the desire to create a safe, educational, and deeply engaging digital space, we built Keedztoi. We wanted to move away from the "infinite scroll" of modern media and back toward the "infinite play" of the toy box, empowered by the reasoning and creative capabilities of Gemini.

The Build

Keedztoi is a React-based frontend application designed with a "squishy," kid-centric aesthetic.

  • The Soul Forge: We use gemini-3-pro-preview with a high thinkingBudget to analyze a child’s simple description (e.g., "a blue dragon who loves pancakes") and expand it into a rich, consistent personality profile.
  • The Manifestation: We utilize gemini-2.5-flash-image to generate high-fidelity 3D avatars that match the child's vision.
  • The Conversation: The heart of the app is the Multimodal Live API (gemini-2.5-flash-native-audio-preview-12-2025), which handles low-latency, bidirectional voice interaction. We process raw PCM audio streams at 16,000 Hz for input and 24,000 Hz for output to ensure a human-like, snappy dialogue.
  • The Cinema: Using veo-3.1-fast-generate-preview, we allow children to "direct" their toys in short animated movies, creating a sense of history and shared adventure.
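To make the audio format in The Conversation concrete, here is a minimal sketch of how microphone samples might be packed into raw PCM before being streamed. It assumes 16-bit signed little-endian mono samples, which this write-up does not state explicitly. The names `INPUT_SAMPLE_RATE` and `floatToPcm16Bytes` are hypothetical and not taken from the Keedztoi codebase.

```typescript
// Assumption: the input stream is 16-bit signed little-endian mono PCM at 16 kHz.
const INPUT_SAMPLE_RATE = 16_000;

// Convert Web Audio style float samples in [-1, 1] into headerless PCM16 bytes.
function floatToPcm16Bytes(samples: Float32Array): Uint8Array {
  const bytes = new Uint8Array(samples.length * 2);
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range samples saturate instead of wrapping around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Scale asymmetrically: -1 maps to -32768 and +1 maps to 32767.
    const scaled = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
    view.setInt16(i * 2, scaled, true); // true = little-endian
  }
  return bytes;
}
```

The microphone is usually captured at the device's native rate (often 44.1 or 48 kHz), so a real pipeline would also need to resample to 16 kHz before this step. That part is omitted here.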

Challenges Faced

The primary technical hurdle was synchronizing the audio visualizer with the raw PCM stream from the Live API. Because the API returns raw bytes without headers, the browser's built-in audio decoding cannot parse them, so we implemented custom decoding logic to feed the browser's AnalyserNode. Another challenge was prompt engineering for the "sentient toy" persona: ensuring the AI remains in character, uses simple vocabulary, and prioritizes the child's creative input over its own logic.
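The decoding step can be sketched roughly as follows. This is an illustration rather than the project's actual code: it assumes the output chunks are 16-bit signed little-endian mono PCM at 24,000 Hz, and the names `OUTPUT_SAMPLE_RATE`, `pcm16BytesToFloat32`, and `chunkDurationMs` are hypothetical.

```typescript
// Assumption: output chunks are 16-bit signed little-endian mono PCM at 24 kHz.
const OUTPUT_SAMPLE_RATE = 24_000;

// Decode headerless PCM16 bytes into float samples in [-1, 1).
function pcm16BytesToFloat32(bytes: Uint8Array): Float32Array {
  // Ignore a trailing odd byte, which can appear if a chunk is split mid-sample.
  const sampleCount = Math.floor(bytes.byteLength / 2);
  // Respect byteOffset: the chunk may be a view into a larger buffer.
  const view = new DataView(bytes.buffer, bytes.byteOffset, sampleCount * 2);
  const out = new Float32Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    out[i] = view.getInt16(i * 2, true) / 0x8000; // true = little-endian
  }
  return out;
}

// Playback length of a chunk, useful for scheduling chunks back to back.
function chunkDurationMs(bytes: Uint8Array, sampleRate: number = OUTPUT_SAMPLE_RATE): number {
  return (Math.floor(bytes.byteLength / 2) / sampleRate) * 1000;
}
```

In the browser, the decoded samples could then be copied into an AudioBuffer created with `createBuffer(1, samples.length, 24000)` via `copyToChannel`, played through an AudioBufferSourceNode, and routed through the AnalyserNode on the way to the destination. Driving the visualizer from the same node graph that produces the sound is one way to keep the two in sync.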
