Inspiration

We kept noticing that the best way any of us actually learned something was by trying to explain it out loud to someone else: the moment you stumble is the moment you realize you don't really get it. That's the Feynman technique, and we wanted to build a study partner that leans into it instead of just handing you answers. So we asked: what if you could just talk through a topic, and something quietly turned your messy explanation into a clean, living set of notes as you went?

What it does

Curio is a voice-first study companion. You start talking through whatever you're learning, and it listens like an attentive tutor, encouraging you to keep going and gently asking a question when you skip over something. While you talk, it builds a live whiteboard of your explanation: structured notes, diagrams, flowcharts, and mind maps that take shape in real time. Voice is the input, the board is the output, and you end up with a study artifact you actually made yourself.

How we built it

The voice side runs on a Pipecat pipeline: Deepgram Flux for speech-to-text and semantic turn-taking, an LLM brain for the conversation, and Cartesia for the spoken replies, all streamed to the browser over WebRTC. The visual side is a tldraw whiteboard driven by a tool contract: the agent doesn't just dump text, it calls specific board actions like "add a flowchart" or "connect these nodes." We split the work into two channels. One agent talks to you, while a separate caller channel watches the transcript, figures out where one topic ends and another begins, and hands each finished topic to a Structuring Agent that decides how to draw it. Next.js holds the frontend together, Supabase handles persistence, and Sentry traces the whole agent pipeline.
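To make the tool-contract idea concrete, here is a minimal sketch of how board actions could be dispatched by name. It is not our actual contract: the `Board` class, the action names `add_node` and `connect_nodes`, and the call shape (`name` plus `arguments`) are all assumptions made for illustration, and the real board is a tldraw canvas rather than a Python object.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Board:
    """Hypothetical in-memory stand-in for the whiteboard state."""
    nodes: dict[str, str] = field(default_factory=dict)          # node id -> label
    edges: list[tuple[str, str]] = field(default_factory=list)   # (source, target)


def add_node(board: Board, node_id: str, label: str) -> None:
    board.nodes[node_id] = label


def connect_nodes(board: Board, source: str, target: str) -> None:
    # Reject edges that point at nodes the board does not know about.
    for node_id in (source, target):
        if node_id not in board.nodes:
            raise ValueError(f"unknown node: {node_id}")
    board.edges.append((source, target))


# The "contract": the only actions the agent is allowed to call (names assumed).
TOOLS: dict[str, Callable[..., None]] = {
    "add_node": add_node,
    "connect_nodes": connect_nodes,
}


def apply_tool_call(board: Board, call: dict[str, Any]) -> None:
    """Look up the requested action and apply it, failing loudly on unknown names."""
    name = call.get("name")
    handler = TOOLS.get(name)
    if handler is None:
        raise ValueError(f"unknown board action: {name!r}")
    handler(board, **call.get("arguments", {}))


board = Board()
for call in [
    {"name": "add_node", "arguments": {"node_id": "a", "label": "Mitosis"}},
    {"name": "add_node", "arguments": {"node_id": "b", "label": "Prophase"}},
    {"name": "connect_nodes", "arguments": {"source": "a", "target": "b"}},
]:
    apply_tool_call(board, call)
```

Routing everything through one dispatcher means a malformed or made-up action from the model is rejected in one place instead of half-drawing something on the board.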

Challenges we ran into

Making the conversation feel human was the hardest part: getting the bot to tolerate little "mm-hmm" backchannels without interrupting itself, but still yield the moment you actually start a real thought. We also got bitten by a bunch of sharp edges: a camelCase/snake_case mismatch that silently dropped our session data on every connect, a Windows console encoding crash that killed the agent on startup over an emoji, missing STUN servers that broke audio for remote users, and a frontend bug where every error showed up as a useless "[object Object]" until we fixed how we unpacked it.
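The camelCase/snake_case mismatch is the kind of bug a small normalization step at the boundary can catch. The sketch below is one way to do that, not the fix we shipped: the helper names (`to_snake_case`, `normalize_keys`, `require_fields`) and the example field names are hypothetical, and it assumes the payload is JSON-like with string keys.

```python
import re
from typing import Any

# Zero-width boundary between a lowercase letter or digit and an uppercase letter.
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def to_snake_case(key: str) -> str:
    """Convert a camelCase key to snake_case; snake_case keys pass through unchanged."""
    return _CAMEL_BOUNDARY.sub("_", key).lower()


def normalize_keys(value: Any) -> Any:
    """Recursively rewrite dict keys to snake_case, leaving other values untouched."""
    if isinstance(value, dict):
        return {to_snake_case(k): normalize_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [normalize_keys(item) for item in value]
    return value


def require_fields(payload: dict, required: set) -> dict:
    """Fail loudly instead of silently dropping data when a field is missing."""
    missing = required - set(payload)
    if missing:
        raise KeyError(f"missing fields: {sorted(missing)}")
    return payload


# Hypothetical connect payload as a browser client might send it.
incoming = {"sessionId": "abc123", "userProfile": {"displayName": "Ada"}}
session = require_fields(normalize_keys(incoming), {"session_id"})
```

The important part is the explicit check at the end: a missing field raises immediately, so a naming mismatch shows up as an error on the first connect rather than as quietly empty session data.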

Accomplishments that we're proud of

We're proud that the conversation actually feels like talking to a patient tutor, not a walkie-talkie: the turn-taking and interruption handling came out feeling natural. And we're proud of the dual-channel design: keeping the "talk to the human" job and the "build the board" job as separate brains made the whole thing cleaner and let the board get genuinely smart about structure instead of just transcribing.
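The shape of that separation can be sketched with two coroutines and a queue. This is an illustration under loud assumptions, not our pipeline: the `<topic-end>` marker stands in for real topic-boundary detection, the "board" is just a list of strings, and the function names are invented for the example.

```python
import asyncio
from typing import Optional

TOPIC_END = "<topic-end>"  # hypothetical marker standing in for real boundary detection


async def transcript_watcher(utterances: list, topics: asyncio.Queue) -> None:
    """Group utterances into topics and hand each finished topic downstream."""
    current = []
    for text in utterances:
        if text == TOPIC_END:
            if current:
                await topics.put(" ".join(current))
                current = []
        else:
            current.append(text)
    if current:  # flush whatever was being explained when the session ended
        await topics.put(" ".join(current))
    await topics.put(None)  # sentinel: no more topics


async def structuring_agent(topics: asyncio.Queue, board: list) -> None:
    """Consume finished topics and decide how to put them on the board."""
    while True:
        topic: Optional[str] = await topics.get()
        if topic is None:
            break
        board.append(f"notes: {topic}")  # placeholder for a real drawing decision


async def run_session(utterances: list) -> list:
    topics: asyncio.Queue = asyncio.Queue()
    board: list = []
    # Both jobs run concurrently; the queue is the only thing they share.
    await asyncio.gather(
        transcript_watcher(utterances, topics),
        structuring_agent(topics, board),
    )
    return board
```

Because the queue is the only coupling, the structuring side can take as long as it needs on one topic without holding up whatever is feeding it.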

What we learned

We learned that real-time voice is brutally unforgiving: a few hundred milliseconds of dead air completely breaks the illusion of a conversation, so almost every decision came down to latency. We also learned how much of "good AI UX" is actually the boring plumbing: error surfacing, encoding, NAT traversal, payload contracts. And we got a real appreciation for designing the agent's relationship to the tools up front rather than bolting it on later.

What's next for Curio

Next we want to deepen the board intelligence, with better diagram choices and a "refining" animation so you can watch rough notes tidy themselves up. We'd like Curio to spot gaps and contradictions across a whole session, not just turn by turn, and eventually generate study artifacts like flashcards and quizzes from what you explained. Longer term, we need proper consent, retention controls, and a delete-my-data flow before anyone uses it for real.
