The Inspiration
Every child has asked the same question: "What if my toys could talk back?" Inspired by the timeless wonder of childhood imagination and the desire to create a safe, educational, and deeply engaging digital space, we built Keedztoi. We wanted to move away from the "infinite scroll" of modern media and back toward the "infinite play" of the toy box, empowered by the reasoning and creative capabilities of Gemini.
The Build
Keedztoi is a React-based frontend application designed with a "squishy," kid-centric aesthetic.
- The Soul Forge: We use gemini-3-pro-preview with a high thinkingBudget to analyze a child’s simple description (e.g., "a blue dragon who loves pancakes") and expand it into a rich, consistent personality profile.
- The Manifestation: We utilize gemini-2.5-flash-image to generate high-fidelity 3D avatars that match the child's vision.
- The Conversation: The heart of the app is the Multimodal Live API (gemini-2.5-flash-native-audio-preview-12-2025), which handles low-latency, bidirectional voice interaction. We process raw PCM audio streams at 16,000 Hz for input and 24,000 Hz for output to ensure a human-like, snappy dialogue.
- The Cinema: Using veo-3.1-fast-generate-preview, we allow children to "direct" their toys in short animated movies, creating a sense of history and shared adventure.
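To illustrate the input side of the conversation pipeline described above, here is a minimal sketch of packing microphone samples into raw PCM bytes. It assumes mono Float32 samples that have already been resampled to 16,000 Hz and a 16-bit little-endian wire format; the bit depth, the function names, and the constant are illustrative assumptions, not taken from the Keedztoi codebase.

```typescript
// Sketch only: assumes mono Float32 samples in [-1, 1], already at 16,000 Hz,
// and 16-bit little-endian PCM on the wire (an assumption for illustration).
const INPUT_SAMPLE_RATE = 16000;

/** Convert Float32 samples to 16-bit little-endian PCM bytes (hypothetical helper). */
function floatTo16BitPcm(samples: Float32Array): Uint8Array {
  const bytes = new Uint8Array(samples.length * 2);
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp so out-of-range samples saturate instead of wrapping around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // The negative range is one step larger than the positive range.
    const scaled = s < 0 ? s * 0x8000 : s * 0x7fff;
    view.setInt16(i * 2, Math.round(scaled), true); // true = little-endian
  }
  return bytes;
}

/** Duration in milliseconds of a 16-bit mono PCM chunk (hypothetical helper). */
function chunkDurationMs(
  byteLength: number,
  sampleRate: number = INPUT_SAMPLE_RATE
): number {
  // Two bytes per sample; multiply before dividing to keep the math exact.
  return ((byteLength / 2) * 1000) / sampleRate;
}
```

Under these assumptions, a 3,200-byte chunk holds 1,600 samples, or 100 ms of audio at 16,000 Hz. Small chunks like this are one way to keep the dialogue feeling responsive.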
Challenges Faced
The primary technical hurdle was synchronizing the audio visualizer with the raw PCM stream from the Live API. Because the API returns raw bytes without headers, we implemented custom decoding logic to feed the browser's AnalyserNode. Another challenge was prompt engineering for the "sentient toy" persona: ensuring the AI remains in character, uses simple vocabulary, and prioritizes the child's creative input over its own logic.
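The decoding step mentioned above can be sketched roughly as follows. This is not the project's actual code: it assumes the headerless output is 16-bit little-endian mono PCM, and pcm16ToFloat32 is a hypothetical helper name.

```typescript
// Sketch only: assumes headerless 16-bit little-endian mono PCM.

/** Decode raw PCM bytes into Float32 samples in [-1, 1) (hypothetical helper). */
function pcm16ToFloat32(bytes: Uint8Array): Float32Array {
  // Ignore a trailing odd byte rather than reading past the end of the chunk.
  const sampleCount = Math.floor(bytes.byteLength / 2);
  // Respect byteOffset so views into a larger buffer decode correctly.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const out = new Float32Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    // Divide by 32768 to map the signed 16-bit range onto [-1, 1).
    out[i] = view.getInt16(i * 2, true) / 0x8000;
  }
  return out;
}
```

In the browser, the decoded samples could then be copied into a buffer created with AudioContext.createBuffer(1, samples.length, 24000) using AudioBuffer.copyToChannel, and played through an AudioBufferSourceNode connected to the AnalyserNode and on to the destination. Because the analyser sits on the playback path, the visualizer reads the same audio the child hears.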