Inspiration
30% of US teens use AI chatbots daily. A third of them prefer talking to AI over talking to people for personal conversations. Common Sense Media found that every major chatbot fails to appropriately handle mental health conditions in young people. A 14-year-old died by suicide after forming an emotional bond with an AI companion. 83% of parents think their children's mental health is getting worse.
Kids are already using AI unsupervised, at scale, and nothing out there was built for them. Parents have no visibility. No control. The $70 billion tutoring market is still gated by cost, and the "kid-safe" AI tools that exist are dumbed-down wrappers that children outgrow fast.
Pebble started as a voice-first study companion for college students. A dedicated Android device, Canvas LMS integration, on-device speech recognition, cloud inference for tutoring. That was the plan going into TreeHacks, and for most of the 36 hours, that's what I built.
Then, past 1 AM on Sunday, with submissions due at 9:30 that morning, I pivoted to children. I added a full parent management system, a real-time web dashboard, age-calibrated guardrails across four tiers (elementary, middle, high, college+), content blocking, device locking, and a multi-layered safety architecture. The college experience stayed intact as the top tier, but the product became something bigger: an AI learning device that parents can actually trust.
How I built it
Solo build. Three components, 36 hours: a Node.js/TypeScript backend, a native Kotlin/Jetpack Compose Android app, and a React parent dashboard.
The Android device runs ONNX Whisper for speech-to-text entirely locally. No child's voice ever leaves the phone. But I didn't try to force the device to do more than it should. The real inference (Socratic tutoring, safety evaluation, tool orchestration) runs through foundation models in the cloud. Claude (Haiku) handles primary chat, with delegation to Sonnet and Opus for harder tasks. The system also routes to OpenAI (GPT) and Google (Gemini) via OpenRouter, so the agent picks the best model for a given task. Parents can select their child's preferred provider from device settings. Regardless of which model is working, all safety guardrails are enforced before any delegation happens.
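The routing idea can be sketched in a few lines. This is an illustrative shape only, not Pebble's actual code: the model names, task fields, and function are hypothetical, and the one load-bearing invariant from the post is that safety evaluation is never subject to provider preference.

```typescript
// Illustrative provider routing (names and model IDs are made up for this sketch).
type Provider = "anthropic" | "openai" | "google";
type Task = { kind: "chat" | "tutoring" | "safety_eval"; difficulty: "easy" | "hard" };

interface RouteDecision { provider: Provider; model: string }

// Parent-selected preference applies to ordinary chat; harder tasks are
// delegated to stronger models, and safety evaluation always runs on a
// fixed, trusted model regardless of preference.
function routeModel(task: Task, parentPreference: Provider): RouteDecision {
  if (task.kind === "safety_eval") {
    return { provider: "anthropic", model: "claude-haiku" };
  }
  if (task.difficulty === "hard") {
    return { provider: "anthropic", model: "claude-sonnet" };
  }
  switch (parentPreference) {
    case "openai": return { provider: "openai", model: "gpt-4o-mini" };
    case "google": return { provider: "google", model: "gemini-flash" };
    default:       return { provider: "anthropic", model: "claude-haiku" };
  }
}
```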
The agentic loop itself is built on Anthropic's Claude Agent SDK, which manages the full tool-use cycle, subagent delegation, and streaming. All 20+ Pebble tools (Canvas data, web scraping, study sessions, quizzes, persistent memory) are exposed to the SDK via an MCP server, and custom hooks bridge SDK events into SSE for the Android client in real time. The AI teaches through active recall and Socratic questioning, never just giving answers. Canvas integration went smoothly thanks to Laura Schauer's OpenAPI spec.
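The general shape of a tool as it might be exposed over MCP looks like this. The tool name, schema, and dispatch function here are hypothetical stand-ins, not the Claude Agent SDK's API; the real build registers tools through the SDK's MCP server rather than a hand-rolled registry.

```typescript
// Illustrative tool definition and dispatch (not the Claude Agent SDK API).
type ToolResult = { content: string };

interface Tool {
  name: string;
  description: string;
  inputSchema: object; // JSON Schema describing the tool's arguments
  handler: (args: Record<string, unknown>) => Promise<ToolResult>;
}

// Hypothetical example tool in the spirit of Pebble's study-session tools.
const startStudySession: Tool = {
  name: "start_study_session",
  description: "Begin a timed study session on a Canvas topic.",
  inputSchema: {
    type: "object",
    properties: { topic: { type: "string" }, minutes: { type: "number" } },
    required: ["topic"],
  },
  handler: async (args) => ({ content: `Session started: ${args.topic}` }),
};

const registry = new Map<string, Tool>([[startStudySession.name, startStudySession]]);

// The agent loop resolves a model's tool call by name and runs the handler.
async function dispatch(name: string, args: Record<string, unknown>): Promise<ToolResult> {
  const tool = registry.get(name);
  if (!tool) throw new Error(`Unknown tool: ${name}`);
  return tool.handler(args);
}
```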
Safety
This is the part I care about most. If you're building AI for children, safety can't be a system prompt disclaimer. It has to be architectural.
Pebble uses three layers. First, heuristic detection: after every response, the backend scans for redirect phrases as a fallback, catching safety events the model didn't explicitly flag. Second, an explicit tool (flag_safety_redirect) the model can call to formally log when it redirects away from a blocked topic, with structured metadata that flows to the parent dashboard. Third, emergency lockout: for critical violations, the model calls emergency_lock_device, which immediately locks the device, flags the conversation for parent review, and pushes a real-time alert via SSE. No further interaction until a parent unlocks it.
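Layers one and three can be sketched as follows. The phrase list, event shapes, and function names are illustrative assumptions, not Pebble's actual implementation; the point is that the heuristic scan is a backstop behind the model's explicit `flag_safety_redirect` calls, and the lock is a hard state change rather than a prompt instruction.

```typescript
// Illustrative layered safety checks (phrases and event shapes are made up).
const REDIRECT_PHRASES = [
  "let's talk about something else",
  "i can't help with that",
  "that's something to discuss with a trusted adult",
];

type SafetyEvent =
  | { kind: "heuristic_redirect"; phrase: string }   // layer 1: backend scan
  | { kind: "flagged_redirect"; topic: string }      // layer 2: flag_safety_redirect
  | { kind: "emergency_lock"; reason: string };      // layer 3: emergency_lock_device

// Layer 1: after every response, scan for redirect language the model
// used but didn't formally flag, so the event still reaches the dashboard.
function heuristicScan(response: string): SafetyEvent | null {
  const lower = response.toLowerCase();
  for (const phrase of REDIRECT_PHRASES) {
    if (lower.includes(phrase)) return { kind: "heuristic_redirect", phrase };
  }
  return null;
}

// Layer 3: a critical violation locks the device; no further interaction
// is possible until a parent unlocks it from the dashboard.
function emergencyLock(deviceState: { locked: boolean }, reason: string): SafetyEvent {
  deviceState.locked = true;
  return { kind: "emergency_lock", reason };
}
```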
Every interaction is also shaped by age-calibrated guardrails. Parents can block custom topics and set daily time limits. And the parent web dashboard shows every conversation, every study session, every safety redirect, plus AI-generated weekly summaries, mastery tracking, Canvas grades, and proactive alerts. The dashboard isn't a token gesture. It's the product's other half.
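A tier policy like this could back the age-calibrated guardrails. The tier names come from the post; every value and field here is a made-up placeholder, and the only behavior being illustrated is that parent-set blocks merge on top of tier defaults rather than replacing them.

```typescript
// Illustrative age-tier policy table (tier names from the post; values invented).
type Tier = "elementary" | "middle" | "high" | "college";

interface TierPolicy {
  maxDailyMinutes: number;
  blockedByDefault: string[];
  readingLevel: string;
}

const TIER_POLICIES: Record<Tier, TierPolicy> = {
  elementary: { maxDailyMinutes: 45,  blockedByDefault: ["violence", "romance"], readingLevel: "grades 2-5" },
  middle:     { maxDailyMinutes: 60,  blockedByDefault: ["violence"],            readingLevel: "grades 6-8" },
  high:       { maxDailyMinutes: 90,  blockedByDefault: [],                      readingLevel: "grades 9-12" },
  college:    { maxDailyMinutes: 180, blockedByDefault: [],                      readingLevel: "adult" },
};

// Parent-set custom blocks extend (never shrink) the tier's defaults.
function effectiveBlocks(tier: Tier, parentBlocks: string[]): string[] {
  return [...new Set([...TIER_POLICIES[tier].blockedByDefault, ...parentBlocks])];
}
```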
Challenges
SSE streaming with agentic tool-use loops was the hardest part. The model calls multiple tools mid-response, and the client needs to know what's happening at every step. I built incremental streaming that emits tool status, text deltas, heading tags, quiz payloads, study session updates, and safety events over a single connection. Most of the bugs I squashed during my all-nighter lived here: the model starting a study session but forgetting to quiz (solved with continuation nudging), text deltas arriving mid-tag, the client disconnecting mid-tool-loop. Each one was small, but they stacked up.
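The event multiplexing described above can be sketched as a discriminated union serialized into SSE frames. The event names and shapes here are illustrative, not Pebble's wire format; what carries over is the idea that one connection interleaves text deltas, tool status, quiz payloads, and safety events as named SSE events.

```typescript
// Illustrative single-connection event stream (event names are made up).
import { EventEmitter } from "node:events";

type StreamEvent =
  | { type: "text_delta"; text: string }
  | { type: "tool_status"; tool: string; status: "started" | "finished" }
  | { type: "quiz"; payload: object }
  | { type: "safety"; detail: string };

// Format one event as an SSE frame: a named event line plus a JSON data line,
// terminated by a blank line per the SSE spec.
function toSseFrame(ev: StreamEvent): string {
  return `event: ${ev.type}\ndata: ${JSON.stringify(ev)}\n\n`;
}

// The agent loop emits events as the tool-use cycle progresses; in the real
// server each frame would be written to the open HTTP response.
const stream = new EventEmitter();
stream.on("event", (ev: StreamEvent) => process.stdout.write(toSseFrame(ev)));
stream.emit("event", { type: "tool_status", tool: "start_study_session", status: "started" });
```

Keeping everything on one connection means the client can't miss a tool event between text chunks, but it also means every consumer has to handle every event type, which is where the mid-tag delta and disconnect bugs lived.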
The kid pivot at 1 AM was its own challenge. Not because the code was hard, but because the scope was enormous. Parent accounts, a web dashboard, real-time alerts, device locking, activity logging, age-calibrated prompts, and a safety architecture I could stand behind. All in under eight hours.
What I learned
The right abstraction boundaries make ambitious projects possible. On-device STT, cloud inference. Architectural safety, not prompt-level. The parent dashboard as a first-class component, not a bolt-on. Each decision made the system easier to reason about, even at 4 AM when nothing was compiling.
Building for kids forces you to think harder. When your user is a 7-year-old, "it usually works" isn't good enough.
Built With
- claude-agent-sdk
- elevenlabs
- gemini
- kotlin
- node.js
- openai
- postgresql
- render
- typescript