Inspiration

We believe active recall and guided discovery are far more effective than passive reading. We were inspired by the Socratic method to build an AI tutor that doesn't just give answers but asks the right questions.

What it does

Agora is a voice-first, multimodal AI tutor that you study with. You upload your notes, PDFs, or even photos of your lecture scribbles, and then talk to the tutor via a push-to-talk interface. It guides you through complex topics with Socratic questions, live analogies, and visual explanations on an infinite blackboard, all while tracking your "confused" topics to generate quizzes later.

How we built it

Agora's backend is a stateful LangGraph agent served by a Python FastAPI app, orchestrating Gemini 2.5 Pro for reasoning and multimodal vision. We use Qdrant as the vector database, both for RAG over uploaded documents and to store the student's long-term "memory". The frontend is a Next.js app that uses Tldraw for the interactive blackboard, connected in real time via Socket.IO for streaming voice and visual commands.
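The per-turn flow through that stack can be sketched in plain Python. The node names and the `TutorState` fields below are our simplification for this writeup, not the actual LangGraph code:

```python
# Illustrative sketch of one tutoring turn through the agent graph.
from dataclasses import dataclass, field

@dataclass
class TutorState:
    transcript: str                                    # from speech-to-text
    retrieved: list = field(default_factory=list)      # RAG chunks from Qdrant
    confused_topics: list = field(default_factory=list)
    reply: str = ""                                    # sent back as TTS + canvas ops

def retrieve(state: TutorState) -> TutorState:
    # Stand-in for the Qdrant similarity search over uploaded notes
    state.retrieved = [f"chunk about: {state.transcript}"]
    return state

def reason(state: TutorState) -> TutorState:
    # Stand-in for the Gemini call that produces a Socratic question
    context = "; ".join(state.retrieved)
    state.reply = f"Given your notes ({context}), what do you think happens first?"
    return state

def run_turn(transcript: str) -> TutorState:
    state = TutorState(transcript=transcript)
    for node in (retrieve, reason):    # LangGraph wires these as graph nodes
        state = node(state)
    return state
```

In the real system each node is registered on a LangGraph `StateGraph`, which is what lets the state survive across turns.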

Challenges we ran into

Integrating the stateful LangGraph agent with a real-time WebSocket connection was our biggest challenge: it caused message loops until we correctly managed listeners in our React hook. We also had to debug tricky API mismatches in Tldraw (like .clear() vs. .deleteShapes()) and fine-tune our RAG node's routing logic so it stopped retrieving old documents in response to simple greetings like "hi".
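The greeting fix boiled down to a small routing guard in front of the RAG node. This is a hedged sketch; the exact greeting list and word-count threshold here are illustrative, not our production values:

```python
# Routing guard: small-talk turns skip the vector search entirely.
GREETINGS = {"hi", "hello", "hey", "yo", "thanks", "ok", "okay"}

def should_retrieve(utterance: str) -> bool:
    """Return False for short small-talk turns so they bypass the RAG node."""
    words = utterance.lower().rstrip("!?. ").split()
    if not words:
        return False                     # nothing to search for
    return not (len(words) <= 2 and words[0] in GREETINGS)
```

In the graph, this becomes a conditional edge: `True` routes to retrieval, `False` routes straight to the reasoning node.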

Accomplishments that we're proud of

We are incredibly proud of achieving a full, end-to-end multimodal loop. The system successfully takes a user's voice, transcribes it, routes it through the AI agent, retrieves context from a vector DB, and responds with both synthesized speech and visual actions on the Tldraw canvas. Seeing the AI draw a note on the blackboard that it "thought" of, based on our prompt, was the magic moment.

What we learned

We learned that a powerful model like Gemini 2.5 Pro is only as good as the robust system you build around it; our Socratic system prompt had to be heavily revised with guardrails to handle "idk" and bad RAG retrievals. We also learned that state management is the most critical piece of a complex agent, which made LangGraph the perfect tool. Finally, we saw that RAG (like ingesting handwritten notes from a photo) is not just a demo: it's practical and powerful.
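To give a flavor of those guardrails: when the student stalls with "idk", the tutor should step down a difficulty level instead of repeating the same question. A minimal sketch of that escalation logic (the phrase list and strategy names are illustrative, not our actual prompt machinery):

```python
# Guardrail sketch: escalate hints when the student is stuck, reset otherwise.
STUCK_PHRASES = {"idk", "i don't know", "no idea", "not sure"}

def next_move(student_answer: str, hint_level: int) -> tuple[str, int]:
    """Return (tutor strategy, new hint level) for this turn."""
    if student_answer.lower().strip() in STUCK_PHRASES:
        # Escalate: open question -> nudge -> concrete worked example
        level = min(hint_level + 1, 2)
        strategy = ["ask_open_question", "give_nudge", "give_worked_example"][level]
        return strategy, level
    return "ask_followup_question", 0     # reset once they engage again
```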

What's next for Agora Socrates

The next step is to make the tutor proactive. We plan to implement a spaced repetition feature that uses the AI's memory of "confused topics" to generate review quizzes at the start of a new session. We also want to implement "self-explanation scoring," where the AI asks the user to explain a concept in their own words and then scores their explanation against the original document context. (We would have loved to finish these during the hackathon.)
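A first sketch of how the spaced-repetition piece could pick which "confused topics" are due for a quiz at session start. The interval-doubling rule is a classic Leitner-style heuristic, and the memory field names are assumptions, since we haven't built this yet:

```python
# Sketch: select confused topics whose review interval has elapsed.
from datetime import date, timedelta

def due_topics(memory: dict[str, dict], today: date) -> list[str]:
    """memory maps topic -> {'last_quizzed': date, 'streak': int}."""
    due = []
    for topic, rec in memory.items():
        interval = timedelta(days=2 ** rec["streak"])  # 1, 2, 4, 8... days
        if today - rec["last_quizzed"] >= interval:
            due.append(topic)
    return due
```

Topics the student keeps getting right drift toward longer intervals; a fresh "confused" topic (streak 0) comes back the very next session.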

Built With

fastapi, gemini, langgraph, next.js, python, qdrant, socket.io, tldraw
