Praxio
Inspiration
Learning usually breaks at the exact moment a student says: "I still don't get it."
At that point, most tools give more explanations: more text, more videos, more lectures. But real understanding often comes from interaction and discovery, not passive consumption.
We wanted to build a tutor that behaves more like a great human mentor: it does not just explain the answer, it creates an experience where the student can figure it out themselves.
What it does
Praxio generates an interactive simulation on demand for a concept the student is stuck on, then guides the student through it with a Socratic voice tutor.
- Student enters a concept (voice or text)
- System generates a custom simulation (physics, math, biology, etc.)
- AI tutor stages the scene (lock/highlight controls, set parameters, annotate key regions)
- Tutor asks questions instead of giving answers
- Student predicts, manipulates, and tests ideas live
- Tutor responds to what the student does in the simulation, not just what they type
The key behavior: the AI never directly "dumps the explanation."
It guides discovery through interaction.
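Responding to what the student does means the tutor consumes structured events from the simulation, not just chat text. A minimal sketch of what such an event and an action summary might look like (the type names and fields are illustrative, not Praxio's actual schema):

```typescript
// Illustrative shape of an event emitted by the running simulation.
// These names are hypothetical, not Praxio's real event schema.
interface SimEvent {
  type: "param_changed" | "prediction_made" | "run_completed";
  controlId: string;
  value: number;
  timestamp: number;
}

// Summarize recent student actions so the tutor's next question can be
// grounded in what the student actually did, not just what they typed.
function summarizeActions(events: SimEvent[]): string {
  return events
    .map((e) => `${e.type}:${e.controlId}=${e.value}`)
    .join("; ");
}
```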
How we built it
- Frontend + API: Next.js App Router (single project, no separate backend service)
- AI orchestration: Vercel AI SDK with @ai-sdk/google
- Model roles: gemma-4-31b-it for the simulation generation pipeline, gemini-2.5-flash for fast tutor turns
- Tutor architecture: two-call pattern per turn
- Tool-call pass to stage simulation actions
- Speech-only pass to stream the Socratic prompt
- Voice: ElevenLabs (Scribe STT + streaming TTS)
- Persistence: MongoDB Atlas for workspaces, branches/checkpoints, and interaction events
- Simulation runtime: sandboxed iframe + runtime SDK with renderer support (p5.js / JSXGraph / Matter.js / canvas2d)
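The two-call tutor turn above can be sketched as follows. The model-call functions are injected as dependencies so the orchestration pattern itself is testable; `stagePass` and `speechPass` stand in for the real tool-call and speech passes and are assumptions, not the actual implementation:

```typescript
// Each tutor turn runs two model calls in sequence: a tool-call-only pass
// that stages the simulation, then a speech-only pass that produces the
// Socratic prompt. Function names and shapes here are illustrative.
type StagingAction = { tool: string; args: Record<string, unknown> };

interface TurnDeps {
  stagePass: (context: string) => Promise<StagingAction[]>;
  speechPass: (context: string, staged: StagingAction[]) => Promise<string>;
}

async function runTutorTurn(context: string, deps: TurnDeps) {
  // Pass 1: tool calls only — lock/highlight controls, set parameters.
  const staged = await deps.stagePass(context);
  // Pass 2: natural language only, conditioned on what was just staged.
  const speech = await deps.speechPass(context, staged);
  return { staged, speech };
}
```

Keeping the two passes separate means a malformed tool call can be retried without re-generating speech, and the speech pass never needs to emit JSON.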
Most importantly, we added a truth-grounding verification loop: generated simulations are tested against deterministic probes and invariants before being trusted as teaching instruments.
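A verification loop of this kind can be sketched as probes with invariants plus retry logic (the `Probe` shape and function names are illustrative assumptions, not Praxio's actual API):

```typescript
// Behavioral verification: run deterministic probes against a generated
// simulation and check invariants before trusting it as a teaching tool.
// Shapes and names here are illustrative, not the real Praxio API.
interface Probe<S> {
  name: string;
  run: (sim: S) => number;                 // deterministic measurement
  invariant: (value: number) => boolean;   // what a correct sim must satisfy
}

// Returns the names of failed probes; an empty array means the sim passes.
function verifySimulation<S>(sim: S, probes: Probe<S>[]): string[] {
  return probes
    .filter((p) => !p.invariant(p.run(sim)))
    .map((p) => p.name);
}

// Regenerate until the simulation passes all probes, or give up.
async function generateVerified<S>(
  generate: () => Promise<S>,
  probes: Probe<S>[],
  maxRetries = 3,
): Promise<S> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const sim = await generate();
    if (verifySimulation(sim, probes).length === 0) return sim;
  }
  throw new Error("simulation failed verification after retries");
}
```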
Challenges we ran into
Reliability of mixed tool calls + speech
Asking models to interleave tool execution and natural language in one response was unreliable.
We solved this with a strict two-call turn: stage first, then speak.
ID/schema drift in generated tool calls
Prompting alone was not enough for exact control IDs.
We moved to dynamically generated strict enums from the runtime manifest.
"Looks right" vs "is right"
A simulation can render beautifully and still teach wrong physics/math.
We built behavioral verification with invariants/probes and retry logic before delivery.
Keeping latency low for tutoring
Generation can tolerate slower calls; conversational tutoring cannot.
We split model responsibilities by latency profile.
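The manifest-driven fix for ID drift can be sketched like this: the runtime manifest is the single source of truth for control IDs, and the allowed values in the tool schema are generated from it rather than described in the prompt (a sketch with illustrative names, not the actual implementation):

```typescript
// The runtime manifest declares every control a simulation exposes.
// Tool-call schemas are derived from it, so the model can only reference
// IDs that actually exist. Names here are illustrative.
interface RuntimeManifest {
  controls: { id: string; kind: "slider" | "toggle" | "button" }[];
}

// The strict enum of valid control IDs, regenerated per simulation.
function controlIdEnum(manifest: RuntimeManifest): string[] {
  return manifest.controls.map((c) => c.id);
}

// Strict membership check instead of trusting the model's free text.
function validateToolCall(
  manifest: RuntimeManifest,
  call: { controlId: string },
): boolean {
  return controlIdEnum(manifest).includes(call.controlId);
}
```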
Accomplishments we are proud of
- Built a working system where the tutor has native function-level control over a live simulation
- Implemented a reusable Socratic interaction loop grounded in student actions/events
- Added behavioral verification, not just syntax/static checks, for generated educational simulations
- Integrated end-to-end voice flow (student speech in, tutor speech out)
- Shipped a multi-renderer simulation runtime that stays sandboxed and controllable
What we learned
- In educational AI, correctness is not optional; bad interactivity is worse than no interactivity.
- Tool-enabled tutoring needs deterministic orchestration patterns, not one giant prompt.
- The biggest jump in learning experience came from requiring students to predict first and then confront outcomes.
- Separating model roles by job type (generation vs live tutoring) dramatically improves UX.
What's next for Praxio
- Expand concept coverage and domain-specific invariant libraries
- Add stronger progress signals (mastery maps + misconception tracking)
- Build teacher-facing controls and classroom workflows
- Improve adaptive tutor strategy (question difficulty and pacing)
