Inspiration
Both of us cracked JEE recently, so the pain is still fresh. You're stuck on a rotational mechanics problem at 11 PM, your coaching doubt-counter is closed, and the toppers' group chat is silent. You paste the question into a generic AI chatbot and it confidently gives you a wrong formula — and as a student, you can't even tell it's wrong.
That's the real problem: it's not that AI can't answer JEE questions, it's that students can't trust the answers. We wanted to build an agent that doesn't just answer — it proves its answer is grounded in real, verified JEE material.
What it does
A student asks any JEE doubt — by typing it, uploading a photo of the question, or speaking it out loud. Then:
- The agent converts the query into a 768-dimensional embedding and runs a semantic vector search on our MongoDB Atlas database of verified Previous Year Questions (PYQs) with step-by-step solutions.
- If a strong match is found, the verified solution grounds Gemini's response, and the student sees a confidence score (e.g., 🎯 Database match: 87%), so they know exactly how much to trust it.
- If no match exists, Gemini 2.5 Flash solves it live, step-by-step, with clean LaTeX-rendered math like:
$$ H_{max} = \frac{u^2 \sin^2\theta}{2g} $$
The agent replies in the student's own language — English, Hindi, or Hinglish — because doubts hit harder in your mother tongue.
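The grounded-versus-live decision described above can be sketched roughly as follows. This is a minimal illustration, not our production code: the `Match` class, the `choose_mode` function, and the 0.80 cutoff are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed cutoff for illustration only; the real value was tuned on JEE queries.
CONFIDENCE_THRESHOLD = 0.80


@dataclass
class Match:
    solution: str  # verified step-by-step PYQ solution text
    score: float   # similarity score from the vector search, in [0, 1]


def choose_mode(best: Optional[Match], threshold: float = CONFIDENCE_THRESHOLD) -> dict:
    """Pick a grounded answer when the best PYQ match is strong enough, else a live solve."""
    if best is not None and best.score >= threshold:
        return {
            "mode": "grounded",
            "context": best.solution,  # passed to the model as grounding material
            "badge": f"Database match: {round(best.score * 100)}%",
        }
    # No match, or a weak one: fall back to solving the question live.
    return {"mode": "live", "context": None, "badge": None}
```

A match scoring 0.87 would produce the badge text "Database match: 87%", while a missing or weak match returns the live mode with no badge.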
How we built it
- Agent Brain (agent_brain.py): Google Gemini 2.5 Flash with multimodal input (text, image OCR, audio). Handles embedding generation, grounding logic, and language matching.
- Knowledge Layer: MongoDB Atlas Vector Search over a curated PYQ collection, with cosine similarity on 768-dim embeddings and a confidence threshold to decide "grounded answer" vs "live solve".
- Gateway (main.py): a FastAPI backend with request validation and in-memory caching, so repeated doubts (and JEE doubts repeat a lot) come back instantly and cheaply.
- Standards (mcp_server.py): we exposed the solver as an official MCP (Model Context Protocol) server, so any MCP-compatible client can use our agent as a tool. MIT licensed and open source.
- Deployment: frontend + backend deployed live on cloud, tested end-to-end before submission.
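For readers curious what the Knowledge Layer query might look like, here is a hedged sketch of an Atlas Vector Search aggregation pipeline. The `$vectorSearch` stage and the `vectorSearchScore` metadata field are part of MongoDB Atlas; the index name `pyq_vector_index` and the field names `embedding`, `question`, and `solution` are assumptions made for this example, not necessarily the names in our database.

```python
from typing import Any, Dict, List

EMBEDDING_DIM = 768  # must match the dimension declared in the vector index


def build_pyq_search_pipeline(
    query_vector: List[float],
    limit: int = 3,
    num_candidates: int = 100,
) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline that returns the closest PYQs with their scores."""
    if len(query_vector) != EMBEDDING_DIM:
        raise ValueError(
            f"expected {EMBEDDING_DIM} dimensions, got {len(query_vector)}"
        )
    return [
        {
            "$vectorSearch": {
                "index": "pyq_vector_index",  # hypothetical index name
                "path": "embedding",          # hypothetical field holding the vectors
                "queryVector": query_vector,
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            "$project": {
                "_id": 0,
                "question": 1,
                "solution": 1,
                # Similarity score computed by Atlas for each returned document.
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

With pymongo, the result would be passed to `collection.aggregate(...)`. For cosine indexes, Atlas reports the score normalized to the 0 to 1 range, which is what makes it convenient to show as a percentage.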
Challenges we ran into
- Embedding dimension mismatch. Our vector index expected 768 dimensions but the embedding model returned a larger vector. Debugging this taught us how vector indexes actually work under the hood. We fixed it by slicing embeddings to a consistent 768 dims on both the ingestion and query side. Lesson: your write path and read path must speak the exact same embedding language.
- Model overload (503) errors. Mid-hackathon, Gemini started throwing 503 UNAVAILABLE during peak traffic. Instead of panicking, we learned to read API error semantics properly: 503 is transient, so graceful handling and retries beat blindly switching models. (We also discovered the hard way how fast model versions get deprecated, so always check the model lifecycle page!)
- Grounding vs hallucination trade-off. Setting the confidence threshold was tricky: too low and wrong PYQs pollute the answer, too high and the database never gets used. We tuned it by testing real JEE queries against the index.
- The clock. Two first-year students, one Rapid Hackathon, and a hard deadline. We learned to ruthlessly prioritize: working > perfect.
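The dimension fix from the first challenge can be sketched as a single helper shared by the ingestion and query paths. The function name `to_index_dim` is hypothetical, and the re-normalization step is an addition for this example rather than something the fix above required: cosine similarity ignores vector length, so it only matters if the index is ever switched to a dot-product metric.

```python
import math
from typing import List, Sequence

EMBEDDING_DIM = 768  # dimension the vector index was created with


def to_index_dim(vector: Sequence[float], dim: int = EMBEDDING_DIM) -> List[float]:
    """Truncate an embedding to the index dimension and rescale it to unit length."""
    if len(vector) < dim:
        raise ValueError(f"embedding has {len(vector)} dims, need at least {dim}")
    head = list(vector[:dim])  # keep only the leading `dim` components
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero vector cannot be normalized; return it unchanged
    return [x / norm for x in head]
```

Calling the same helper when storing PYQ embeddings and when embedding a student's query keeps the write path and the read path in agreement.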
Accomplishments that we're proud of
- We actually shipped. A fully deployed, live, working agent — frontend + backend + database — built by two first-year students in a rapid hackathon. Judges can click the link and use it right now.
- Hallucination-resistant architecture. When a verified PYQ matches, the answer is grounded in that solution and shown with a visible confidence score, instead of being an unsupported AI guess. We built trust into the product, not just accuracy.
- True multimodal input. Type your doubt, upload a photo of the question, or just speak it — one agent handles all three through Gemini.
- Real vector search in production. We didn't mock it — we designed the MongoDB Atlas vector index, built the embedding pipeline, hit a 768-dimension mismatch bug, debugged it, and fixed it properly.
- MCP server implementation. Our solver isn't a closed app — it's exposed as an official Model Context Protocol server, so any MCP-compatible client can plug into it. Open source, MIT licensed.
- It speaks the student's language. Hindi, Hinglish, English — doubts get answered the way students actually think.
- Grace under fire. When Gemini threw 503 errors a day before the deadline, we diagnosed the root cause instead of panic-rewriting working code.
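The graceful handling of transient 503 errors mentioned in the challenges can be illustrated with a small retry helper using exponential backoff and jitter. This is a sketch under stated assumptions: `TransientAPIError` is a hypothetical stand-in for whatever exception the API client raises on 503 UNAVAILABLE, and the attempt count and delays are illustrative values.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientAPIError(Exception):
    """Hypothetical stand-in for a 503 UNAVAILABLE error from the model API."""


def call_with_retries(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying transient failures with exponential backoff plus jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientAPIError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * 2 ** (attempt - 1)  # 0.5s, 1s, 2s, ...
            sleep(delay + random.uniform(0, delay / 2))  # jitter spreads out retries
    raise RuntimeError("unreachable")  # the loop always returns or raises
```

Only the transient error type is retried, so genuine bugs such as a bad request still fail immediately. The `sleep` parameter is injectable so the helper can be tested without waiting.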
What we learned
- RAG is an architecture, not a buzzword: retrieval, grounding, confidence scoring, and fallback each need deliberate design.
- How MongoDB Atlas Vector Search works end-to-end: index definition, embedding pipelines, similarity scoring.
- Building multimodal agents: handling text, images, and audio through one unified Gemini interface.
- What the MCP standard is and why exposing agents as interoperable tools matters for the ecosystem.
- And the meta-lesson: shipping a real, deployed product in days is mostly about scope discipline.
What's next for JEE Solver Agent
- Conversational memory via MongoDB: the agent remembers each student's doubts and weak topics across sessions, becoming a true personal tutor.
- Bigger PYQ database: more years, more chapters, more verified solutions.
- NEET and board exams: same architecture, new knowledge bases.
- Student progress tracking: analytics on weak areas, suggested practice.
Our vision: a personal AI tutor for every student in India — one that never sleeps, never judges, and never hallucinates a formula.
Built With
- fastapi
- gemini-api
- google-cloud
- mcp
- mongodb
- mongodb-atlas
- pil
- pydantic
- pymongo
- python
- python-dotenv
- uvicorn
- vector-search