Inspiration
As a Senior Engineer, I’m often the first person people turn to when they want to learn how to code. I’ve noticed a consistent pattern: learners get most discouraged at the exact moment they get stuck, and a human mentor isn't always available to unblock them. Even with the best online resources, the experience is fragmented across lessons, editors, terminals, and chat tools, making the learning process slower and more frustrating than it needs to be.
At the same time, the industry is under immense pressure to close skills gaps faster. This pushed me to think about how learning support could become more contextual and immediate. Agent Tutor was born from that idea: a unified workspace where the curriculum, the code, the runtime feedback, and the tutor exist in a single loop, giving learners help when they actually need it.
What it does
Agent Tutor is a Python learning workspace featuring a real-time Gemini Live tutor built directly into the coding environment.
Instead of forcing learners to jump between a lesson page, an IDE, a terminal, and an external chatbot, Agent Tutor integrates the entire journey. Learners can open a lesson, edit code, run it, and inspect terminal output all in one place.
What sets it apart is that the tutor is grounded in the learner’s live state. It doesn't just respond to a prompt in isolation; it understands the current lesson content, the code in the editor, the latest runtime failure, and a screenshot of the visible workspace to guide the learner like a real-time human mentor.
How I built it
I architected Agent Tutor as a suite of focused, decoupled services:
- apps/web: The learner-facing interface built with Next.js, providing the Monaco editor, terminal, and integrated tutor experience.
- apps/api: A high-performance gateway running on Cloudflare Workers with Hono. It handles orchestration, workspace state, and the minting of short-lived signed tokens for secure tutor connections.
- apps/runner-code-executor: A dedicated execution service on Google Cloud Run that provisions fresh, ephemeral Python environments for every run, returning stdout, stderr, and exit codes.
- apps/agent-tutor-live: A real-time service on Google Cloud Run utilizing the Google GenAI SDK to power the Gemini Live session.
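To make the "short-lived signed tokens" idea from the apps/api bullet concrete, here is a minimal sketch of minting and verifying an expiring HMAC-signed token. It is written in Python for illustration only; the real gateway runs on Cloudflare Workers with Hono, and the token format, the `SECRET` value, and the `mint_token` / `verify_token` names are all assumptions rather than the project's actual implementation.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical secret for the sketch; a real deployment would load it from a secret store.
SECRET = b"demo-secret-not-for-production"


def _b64encode(data: bytes) -> str:
    """URL-safe base64 without padding, so the token is safe in URLs and headers."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def mint_token(user_id: str, ttl_seconds: int = 60, now: Optional[float] = None) -> str:
    """Return 'payload.signature', where the payload carries an expiry timestamp."""
    issued = time.time() if now is None else now
    payload = json.dumps({"sub": user_id, "exp": int(issued) + ttl_seconds}).encode("utf-8")
    signature = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return f"{_b64encode(payload)}.{_b64encode(signature)}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature matches and the token has not expired."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = _b64decode(payload_b64)
        signature = _b64decode(signature_b64)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # signature mismatch: forged or corrupted
    claims = json.loads(payload)
    current = time.time() if now is None else now
    return claims if current < claims["exp"] else None
```

In this sketch the tutor service only needs the shared secret to check a token, so it never has to call back to the gateway, and a leaked token stops working once its `exp` time passes.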
I also developed a custom grounding layer that structures lesson data and workspace telemetry, ensuring the tutor’s advice is specific to the learner's current problem rather than generic.
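As a rough illustration of what such a grounding layer might produce, the sketch below folds lesson data and workspace state into one labelled text block that could be sent to the tutor alongside the learner's question. The `WorkspaceSnapshot` fields, the section labels, and the `build_grounding_context` function are hypothetical, and the screenshot input is left out to keep the example text-only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkspaceSnapshot:
    """Hypothetical view of the learner's live state at the moment they ask for help."""
    lesson_title: str
    lesson_text: str
    editor_code: str
    last_stderr: Optional[str] = None
    last_exit_code: Optional[int] = None


def build_grounding_context(snapshot: WorkspaceSnapshot, max_chars: int = 2000) -> str:
    """Assemble labelled sections, adding a failure section only after a failed run."""
    sections = [
        ("LESSON", f"{snapshot.lesson_title}\n{snapshot.lesson_text}"),
        ("EDITOR CODE", snapshot.editor_code),
    ]
    if snapshot.last_exit_code not in (None, 0):
        failure = f"exit code {snapshot.last_exit_code}\n{snapshot.last_stderr or ''}"
        sections.append(("LATEST FAILURE", failure))
    # Trim each section so one oversized input cannot crowd out the others.
    parts = [f"## {name}\n{body.strip()[:max_chars]}" for name, body in sections]
    return "\n\n".join(parts)


snapshot = WorkspaceSnapshot(
    lesson_title="Variables",
    lesson_text="Assign a value to a name before you use it.",
    editor_code="print(total)",
    last_stderr="NameError: name 'total' is not defined",
    last_exit_code=1,
)
print(build_grounding_context(snapshot))
```

Keeping the sections explicit and bounded is one way to make the tutor's answer refer to this lesson and this error instead of giving generic Python advice.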
Challenges I ran into
One of the biggest hurdles was the execution infrastructure. I initially explored a different sandbox path, but platform constraints required me to redesign the execution layer into a dedicated service on Google Cloud Run. This forced me to iterate quickly on how execution data streams back to the UI, and the result was a more robust architecture.
I also faced performance constraints at the edge. My initial production sign-in flow hit Cloudflare Worker CPU limits because the standard password verification was too computationally expensive for a serverless edge environment. I had to re-engineer the auth strategy to be "Worker-safe" without compromising security.
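To show the shape of the data the execution service hands back, here is a minimal sketch of a single run that returns stdout, stderr, and an exit code. It uses a throwaway temporary directory and a child Python process as stand-ins; the `run_python` name and the result dictionary are assumptions, and a temporary directory is not a security sandbox the way an isolated Cloud Run environment is.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_python(source: str, timeout_seconds: float = 5.0) -> dict:
    """Run learner code in a fresh working directory and capture its output."""
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "main.py"
        script.write_text(source, encoding="utf-8")
        try:
            completed = subprocess.run(
                # -I starts Python in isolated mode (ignores user site-packages and env vars).
                [sys.executable, "-I", str(script)],
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_seconds,
            )
        except subprocess.TimeoutExpired:
            # No exit code exists for a run that was killed by the timeout.
            return {"stdout": "", "stderr": "timed out", "exit_code": None}
    # The temporary directory is deleted here, so nothing carries over to the next run.
    return {
        "stdout": completed.stdout,
        "stderr": completed.stderr,
        "exit_code": completed.returncode,
    }


result = run_python("print(undefined_name)")
print(result["exit_code"], result["stderr"].strip().splitlines()[-1])
```

A result in this shape gives the UI what it needs to render the terminal and gives the tutor the latest failure to explain.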
Finally, I spent significant effort on contextual grounding. It is easy to bolt a chatbot onto a UI, but much harder to make that assistant truly understand why a specific command failed within the context of a specific lesson.
Accomplishments that I am proud of
I am proud that Agent Tutor feels like a coherent, AI-native product rather than a collection of disconnected demos. I built a workspace where the feedback loop (running code, seeing a failure, and getting a multimodal explanation) is nearly instantaneous.
On the engineering side, I am proud of the hybrid architecture. Using Cloudflare Workers for edge speed and Google Cloud Run for isolated code execution and AI processing allowed me to build a responsive system that doesn't compromise on security.
What I learned
I learned that AI in education is only as good as its context. When a model is deeply grounded in the learner’s real-time task, it shifts from being a "search engine replacement" to a genuine cognitive partner.
I also learned the importance of respecting platform boundaries. Real-time tutoring, browser security, and ephemeral execution each have unique behaviors under load. The right architecture didn't come from forcing one tool to do everything, but from choosing the right runtime for each specific challenge.
What's next for Agent Tutor
The long-term goal is to make Agent Tutor a fully autonomous, integrated learning environment. The roadmap includes:
- Generative Curricula: Personalized course generation based on a learner’s specific goals and past mistakes.
- Deeper Memory: Giving the tutor long-term memory across sessions to track growth.
- Multi-language Support: Expanding execution runtimes beyond Python to include JavaScript and TypeScript.
- Educator Tools: A workflow for teachers to provision workspaces and review learner recovery patterns.
Built With
- cloudflare
- gcp
- gemini
- hono
- nextjs
- node.js
- shadcn
- zod
- zustand