Inspiration

It often starts with a simple question.

A child points at something in their world, like a bird, shadow, or passing car. Then they ask "why?" or "how does that work?" But those moments are fragile. Adults are busy, unsure, or default to "look it up", and the curiosity fades as quickly as it appeared.

Even when technology steps in, it doesn't quite solve the problem. Search engines and chatbots deliver instant answers, but they often bypass the thinking that made the question meaningful. Worse, they pull children away from the real world and onto a screen, separating them from the very thing they were curious about.

At the same time, guidance from the American Academy of Pediatrics (AAP) has shifted toward a “healthy media diet” that emphasizes high-quality content, co-engagement with caregivers, and intentional design over antiquated raw screen-time limits. Seeing this shift, we wanted to build something that meets kids in their curiosity, keeps them grounded in the real world, and turns their questions into deeper exploration and creativity. So we created Curiosity Kids.

What it does

Curiosity Kids is a mixed-reality learning companion for children ages 5-11. A child simply points at something in their real world and asks "why?" PiPi, an animated AI guide, responds with age-appropriate explanations and gentle, Socratic prompts that encourage thinking rather than just giving answers.

The experience centers on two core interaction loops. In the first, children can ask about anything they see. The system identifies the most interesting learning angle, then PiPi gives a short answer and nudges them toward asking follow-up questions.

In the second, children become creators. They build simple three-scene stories by choosing subjects, actions, effects, and locations. Once a parent approves the story, the app animates their ideas using 3D models, allowing their imagination to play out in front of them.

Curiosity Kids runs on Snap Spectacles and in any web browser, so families can get started without specialized hardware. The web app also powers voice input, stores questions and stories, and tracks a "Curiosity Quotient" alongside learning mini-games like Pattern Grid, Number Climb, and Knowledge Garden. A Parent Hub gives caregivers full visibility into questions and stories, age controls, and approval over generated content.

How we built it

Curiosity Kids is built across three connected surfaces: a web app, an AR experience on Snap Spectacles, and an AI backend.

The web app (Next.js) includes question input, 3D story playback, child profiles, progress tracking, educational mini-games, and the parent dashboard.

On Spectacles, we built an interactive AR Lens where PiPi (the platform's mascot) appears in the child’s world, using gaze, gestures, and voice input to trigger interactions and anchor mesmerizing 3D “Story Theater” scenes in real space.

Behind the scenes, a FastAPI backend powers the AI story-making pipeline, which draws on a library of over 300 3D models, 30+ shaders/VFX, and dozens of animations and stage materials to produce non-repetitive, memorable stories. Other parts of the backend combine object detection, reasoning, story generation, safety guardrails, and text-to-speech into a single flow that responds in real time.

We designed the AI stack to be both flexible and efficient. It uses open multimodal models (such as Qwen2.5-VL for vision and DeepSeek for reasoning) on a serverless, pay-per-token setup, with higher-end models reserved for premium-quality outputs. To reduce latency, story generation is built directly from the reasoning step, avoiding extra model calls.
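
A minimal sketch of the routing idea described above, under stated assumptions: the model labels are placeholders rather than real deployment identifiers, and `Request`, its `premium` flag, and `route_model` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass

# Placeholder labels; the actual deployment identifiers are not given in this write-up.
DEFAULT_VISION = "qwen2.5-vl"
DEFAULT_REASONING = "deepseek"
PREMIUM_MODEL = "premium-model"  # stands in for a "higher-end" model


@dataclass(frozen=True)
class Request:
    has_image: bool        # True when the child pointed at something in view
    premium: bool = False  # assumed flag: caller wants premium-quality output


def route_model(req: Request) -> str:
    """Pick a model label: premium if requested, otherwise an open default."""
    if req.premium:
        return PREMIUM_MODEL
    return DEFAULT_VISION if req.has_image else DEFAULT_REASONING
```

Keeping the decision in one small pure function makes the cost policy easy to test and to change without touching the rest of the pipeline.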

We also prioritized reliability for the demo. The core experience includes a series of offline fallbacks, so the system can continue working even without API access or a stable connection.
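
One way such a fallback could look, as a rough sketch rather than the project's actual code: the canned answers, `normalize`, and `answer_with_fallback` are all hypothetical, and `ask_model` stands for whatever function calls the live model.

```python
from typing import Callable

# Hypothetical canned answers used when no model is reachable.
OFFLINE_ANSWERS = {
    "why is the sky blue": "Sunlight is made of many colors, and the air scatters blue light the most.",
}
GENERIC_FALLBACK = "Great question! Let's look closely together. What do you notice first?"


def normalize(question: str) -> str:
    """Lowercase and drop punctuation and extra spaces so lookups are forgiving."""
    kept = "".join(ch for ch in question.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def answer_with_fallback(question: str, ask_model: Callable[[str], str]) -> str:
    """Try the live model first; on any failure, return a deterministic answer."""
    try:
        return ask_model(question)
    except Exception:
        # Covers network errors, missing API keys, timeouts, and similar failures.
        return OFFLINE_ANSWERS.get(normalize(question), GENERIC_FALLBACK)
```

Because the fallback path is deterministic, it can be exercised in a demo rehearsal simply by passing an `ask_model` that always raises.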

Challenges we ran into

Designing for a kid's real environment instead of a flat screen introduced a core challenge: identifying the single most meaningful object in a messy scene and responding quickly enough to feel interactive, while still keeping explanations age-appropriate.

Architecting a generalizable story-making pipeline within the Spectacles' 300 MB budget was incredibly difficult: it had to produce visually meaningful, original output for any question a child might ask. That meant carefully selecting assets and effects to balance breadth with meaning.

Making the experience hardware-optional was its own challenge: the same loop had to feel natural both on Spectacles and in a web browser, with the phone acting as a microphone and processing layer for the AR experience.

Cost was also a key constraint. Children tend to ask the same questions repeatedly, so we implemented semantic caching, prompt caching, and model routing to avoid redundant calls (for example, answering "why is the sky blue" once, not every time).
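
The semantic-caching idea can be sketched as follows. This is an illustration under loud assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, the 0.8 threshold is arbitrary, and `SemanticCache` and `cached_answer` are hypothetical names.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Optional, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold  # assumed cutoff for "same question"
        self.entries: List[Tuple[Counter, str]] = []

    def lookup(self, question: str) -> Optional[str]:
        """Return the answer of the most similar stored question, if close enough."""
        query = embed(question)
        best_score, best_answer = 0.0, None
        for vec, answer in self.entries:
            score = cosine(query, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None

    def store(self, question: str, answer: str) -> None:
        self.entries.append((embed(question), answer))


def cached_answer(question: str, cache: SemanticCache, ask_model: Callable[[str], str]) -> str:
    """Serve from cache when a similar question was seen; otherwise call the model once."""
    hit = cache.lookup(question)
    if hit is not None:
        return hit
    answer = ask_model(question)
    cache.store(question, answer)
    return answer
```

With this shape, "why is the sky blue" and "Why is the sky blue?" map to the same cached answer, so the model is called only for the first of the two.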

Finally, safety was critical. Because the product is designed for children, all stories pass through parental approval and age-tiered filtering before being shown, ensuring content remains appropriate and trustworthy.
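
A minimal sketch of such a gate, assuming a simple keyword filter in place of whatever moderation the real system uses. The two age tiers, the blocked-word lists, and the names `Story`, `tier_for_age`, `passes_age_filter`, and `can_show` are hypothetical; only the 5-11 age range comes from this write-up.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical tiers and blocked words; the real tiers are not specified here.
BLOCKED_BY_TIER = {
    "5-7": {"scary", "weapon", "injury"},
    "8-11": {"weapon"},
}


@dataclass
class Story:
    scenes: List[str]              # three short scene descriptions
    parent_approved: bool = False  # set from the Parent Hub


def tier_for_age(age: int) -> str:
    """Map a child's age to an assumed content tier."""
    if not 5 <= age <= 11:
        raise ValueError("supported ages are 5-11")
    return "5-7" if age <= 7 else "8-11"


def passes_age_filter(story: Story, age: int) -> bool:
    """Reject a story if any scene contains a word blocked for the child's tier."""
    blocked = BLOCKED_BY_TIER[tier_for_age(age)]
    words = {w.strip(".,!?").lower() for scene in story.scenes for w in scene.split()}
    return words.isdisjoint(blocked)


def can_show(story: Story, age: int) -> bool:
    """A story is shown only if it passes the age filter and a parent approved it."""
    return passes_age_filter(story, age) and story.parent_approved
```

Requiring both checks means that parental approval alone cannot override the age filter, and a clean story still waits for a parent.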

Accomplishments that we're proud of

We built a complete, end-to-end experience. A child can look at the world around them, ask a question, hear PiPi respond, and turn that curiosity into an animated story, working seamlessly across both AR (Spectacles) and the web.

We grounded the entire design in current pediatric guidance from the start, rather than treating safety as an afterthought. Co-use with caregivers, parental oversight, short sessions, and the absence of engagement-maximizing patterns (like autoplay or infinite scroll) are built into the architecture.

We also made the system efficient and accessible to run. By using open-model defaults, a fast path for story generation, offline fallbacks, and a bring-your-own-key option, we kept costs low while maintaining flexibility for different users.

What we learned

We learned that the hard part is not generating an answer; it is generating the right kind of answer. The most effective responses don't close the loop; they extend it. Socratic scaffolding, the practice of asking questions back and guiding thinking, proved to be the core of the product, not just a feature.

On the engineering side, we found that cost and performance improve significantly when repeated questions are treated as a caching problem rather than a fresh inference each time. Designing deterministic fallbacks early also made the system far more resilient, instead of assuming models or network access would always be available.

We also gained a clearer perspective on the platform itself. AR, particularly through Spectacles, is a strong fit precisely because it keeps learning anchored in the child's real-world environment, rather than pulling their attention onto a separate device.

What's next for Curiosity Kids

Next up is shipping the Socratic tutor and Daily Ritual loops fully onto the Spectacles, launching paid family tiers with Stripe, and running school and library pilot programs to build early validation and partnerships.

From there, we plan to expand the learning center with more games and skill tracking, deepen accessibility for children with different sensory and language needs, and introduce shared, multiplayer Spectacles sessions so kids, parents, and caregivers can explore and learn together in the same scene.

Longer term, the goal is to become the trusted first AR app a parent installs for their child, and to keep proving that screen time can be active, curiosity-driven, and designed for connection, not isolation.
