Praxio

Inspiration

Learning usually breaks at the exact moment a student says: "I still don't get it."
At that point, most tools respond with more explanation: more text, more videos, more lectures. But real understanding often comes from interaction and discovery, not from passive consumption.

We wanted to build a tutor that behaves more like a great human mentor: one that does not just explain the answer but creates an experience in which the student can figure it out for themselves.

What it does

Praxio generates an interactive simulation on demand for a concept the student is stuck on, then guides the student through it with a Socratic voice tutor.

  • Student enters a concept (voice or text)
  • System generates a custom simulation (physics, math, biology, etc.)
  • AI tutor stages the scene (lock/highlight controls, set parameters, annotate key regions)
  • Tutor asks questions instead of giving answers
  • Student predicts, manipulates, and tests ideas live
  • Tutor responds to what the student does in the simulation, not just what they type

The key behavior: the AI never directly "dumps the explanation."
It guides discovery through interaction.
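The last two bullets are the core loop: the tutor's next question is conditioned on what the student just did in the simulation, not only on what they typed. A minimal sketch of that grounding step (the `SimEvent` shape and `buildTutorContext` name are illustrative, not our exact API):

```typescript
// Hypothetical event shape emitted by the simulation runtime.
interface SimEvent {
  control: string;   // which control the student touched
  value: number;     // the value they set it to
  timestamp: number;
}

// Fold recent simulation events into the context the tutor model sees,
// so its next Socratic question reacts to actions, not just chat text.
function buildTutorContext(studentMessage: string, events: SimEvent[]): string {
  const observed = events
    .map((e) => `- set ${e.control} to ${e.value}`)
    .join("\n");
  return [
    "Student actions in the simulation since the last turn:",
    observed || "- (no interaction)",
    `Student said: "${studentMessage}"`,
    "Ask one guiding question; do not reveal the answer.",
  ].join("\n");
}
```

The instruction on the last line is what enforces the "never dump the explanation" behavior at the prompt level.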

How we built it

  • Frontend + API: Next.js App Router (single project, no separate backend service)
  • AI orchestration: Vercel AI SDK with @ai-sdk/google
  • Model roles:
    • gemma-4-31b-it for simulation generation pipeline
    • gemini-2.5-flash for fast tutor turns
  • Tutor architecture: two-call pattern per turn
    1. Tool-call pass to stage simulation actions
    2. Speech-only pass to stream the Socratic prompt
  • Voice: ElevenLabs (Scribe STT + streaming TTS)
  • Persistence: MongoDB Atlas for workspaces, branches/checkpoints, and interaction events
  • Simulation runtime: sandboxed iframe + runtime SDK with renderer support (p5.js / JSXGraph / Matter.js / canvas2d)
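Because the simulation runs in a sandboxed iframe, the host page and the runtime SDK can only talk via `postMessage`, so tutor tool calls have to be serialized into a small command protocol. A rough sketch of the host side (the `SimCommand` vocabulary and message type are invented for illustration):

```typescript
// Hypothetical command vocabulary the tutor can stage on the simulation.
type SimCommand =
  | { kind: "setParam"; id: string; value: number }
  | { kind: "lockControl"; id: string }
  | { kind: "annotate"; id: string; text: string };

// Serialize a batch of staged commands for the sandboxed iframe.
// The runtime SDK inside the iframe is assumed to validate and apply them.
function stageCommands(commands: SimCommand[]): string {
  return JSON.stringify({ type: "praxio:stage", commands });
}

// In the real app the payload would be delivered with
// iframe.contentWindow.postMessage(payload, targetOrigin).
```

Keeping the protocol this narrow is what lets the runtime stay sandboxed while still being fully controllable by the tutor.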

Most importantly, we added a truth-grounding verification loop: generated simulations are tested against deterministic probes and invariants before being trusted as teaching instruments.

Challenges we ran into

  1. Reliability of mixed tool calls + speech
    Asking models to interleave tool execution and natural language in one response was unreliable.
    We solved this with a strict two-call turn: stage first, then speak.
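In our codebase the two passes are separate Vercel AI SDK calls; the orchestration reduces to the sketch below, with the model calls abstracted behind injected functions (the `StagePass`/`SpeakPass` names are illustrative):

```typescript
// Hypothetical signatures for the two passes of one tutor turn.
type StagePass = (context: string) => Promise<string[]>;                 // returns staged action ids
type SpeakPass = (context: string, staged: string[]) => Promise<string>; // returns spoken prompt

// Strict two-call turn: stage the simulation first (tool calls only),
// then generate the spoken Socratic prompt in a separate, speech-only pass.
async function tutorTurn(
  context: string,
  stage: StagePass,
  speak: SpeakPass,
): Promise<{ staged: string[]; speech: string }> {
  const staged = await stage(context);          // pass 1: tools, no prose
  const speech = await speak(context, staged);  // pass 2: prose, no tools
  return { staged, speech };
}
```

Splitting the turn this way also lets the speech pass see what was actually staged, so the spoken question can reference the controls it just highlighted.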

  2. ID/schema drift in generated tool calls
    Prompting alone could not guarantee that generated tool calls referenced exact, existing control IDs.
    We moved to strict enums generated dynamically from the runtime manifest.
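In the real pipeline the allowed IDs are fed into the tool schema as a strict enum (zod, via the AI SDK); the core idea is just deriving the allowed set from the runtime manifest instead of the prompt. A dependency-free sketch (manifest shape is illustrative):

```typescript
// Hypothetical manifest shape exported by the simulation runtime.
interface RuntimeManifest {
  controls: { id: string }[];
}

// Build a validator from the manifest so tool calls can only reference
// control IDs that actually exist in this specific simulation.
function controlIdValidator(manifest: RuntimeManifest): (id: string) => boolean {
  const allowed = new Set(manifest.controls.map((c) => c.id));
  return (id) => allowed.has(id);
}
```

Regenerating the enum per simulation means a hallucinated or slightly misspelled ID fails validation immediately instead of silently doing nothing.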

  3. "Looks right" vs "is right"
    A simulation can render beautifully and still teach wrong physics/math.
    We built behavioral verification with invariants/probes and retry logic before delivery.
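As a toy version of that verification step: run the candidate simulation deterministically and check domain invariants before accepting it, regenerating on failure. All names and the retry count below are illustrative:

```typescript
// A probe drives the candidate simulation's step function with
// deterministic inputs and reports whether a domain invariant held
// (e.g. "the quantity decays; energy never increases").
type Probe = (step: (x: number) => number) => boolean;

// Accept a generated simulation only if every probe passes,
// retrying generation up to `maxAttempts` times otherwise.
function verifyWithRetry(
  generate: () => (x: number) => number,
  probes: Probe[],
  maxAttempts = 3,
): ((x: number) => number) | null {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const sim = generate();
    if (probes.every((probe) => probe(sim))) return sim;
  }
  return null; // never ship an unverified simulation
}
```

The important property is that probes exercise behavior, not appearance: a simulation that renders beautifully but violates an invariant is rejected.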

  4. Keeping latency low for tutoring
    Generation can tolerate slower calls; conversational tutoring cannot.
    We split model responsibilities by latency profile.
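Concretely, the split is just a routing table from job type to model, using the two roles listed above:

```typescript
// Route each job to the model whose latency profile fits it.
// Model IDs mirror the roles described in "How we built it".
const MODEL_BY_JOB = {
  simulationGeneration: "gemma-4-31b-it", // slower, higher-effort pipeline
  tutorTurn: "gemini-2.5-flash",          // fast conversational turns
} as const;

type Job = keyof typeof MODEL_BY_JOB;

function pickModel(job: Job): string {
  return MODEL_BY_JOB[job];
}
```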

Accomplishments we are proud of

  • Built a working system where the tutor has native function-level control over a live simulation
  • Implemented a reusable Socratic interaction loop grounded in student actions/events
  • Added behavioral verification, not just syntax/static checks, for generated educational simulations
  • Integrated end-to-end voice flow (student speech in, tutor speech out)
  • Shipped a multi-renderer simulation runtime that stays sandboxed and controllable

What we learned

  • In educational AI, correctness is not optional; bad interactivity is worse than no interactivity.
  • Tool-enabled tutoring needs deterministic orchestration patterns, not one giant prompt.
  • The biggest jump in learning experience came from requiring students to predict first and then confront outcomes.
  • Separating model roles by job type (generation vs live tutoring) dramatically improves UX.

What's next for Praxio

  • Expand concept coverage and domain-specific invariant libraries
  • Add stronger progress signals (mastery maps + misconception tracking)
  • Build teacher-facing controls and classroom workflows
  • Improve adaptive tutor strategy (question difficulty and pacing)
