Inspiration
Most study tools solve the wrong problem. They make it easier to get answers — not easier to actually understand something. We kept noticing that students (ourselves included) could use ChatGPT to produce a correct response and still have no idea what was going on underneath it. The answer looked right. The understanding wasn't there. We wanted to build something that cared about the second part.
What it does
StudyBuddy takes your answer to a question and evaluates the reasoning behind it — not just whether you're right, but how you got there. It identifies specific gaps in your thinking, names the misconceptions it detects, and generates follow-up questions that push you further. Two students can go head-to-head on the same question and see their reasoning compared side by side. The goal isn't to tell you the answer. It's to show you where your thinking breaks down.
How we built it
The frontend is React with TypeScript and Vite. State runs through Zustand, which coordinates the multi-step flow — question in, answer submitted, evaluation triggered, results surfaced. All Claude interactions live in a dedicated service layer so the UI never touches the API directly. We used Supabase for real-time sync between players. The Claude API handles the evaluation, and we spent a lot of time getting it to return clean, structured JSON consistently rather than freeform text.
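The multi-step flow above can be sketched as a plain state machine. This is a simplified stand-in for the actual Zustand store — the phase names, fields, and actions here are illustrative, not our production code — but it shows how each transition is gated on the current phase:

```typescript
// Simplified model of the question → answer → evaluation → feedback flow.
// In the real app this lives in a Zustand store; a plain reducer makes
// the transitions easy to see. All names are illustrative.
type Phase = "question" | "answered" | "evaluating" | "feedback";

interface FlowState {
  phase: Phase;
  question: string;
  answer?: string;
  feedback?: string;
}

type FlowAction =
  | { type: "SUBMIT_ANSWER"; answer: string }
  | { type: "START_EVALUATION" }
  | { type: "RECEIVE_FEEDBACK"; feedback: string };

function flowReducer(state: FlowState, action: FlowAction): FlowState {
  // Each transition is only valid from one phase; anything else is a no-op,
  // which keeps the UI from ending up in an impossible state.
  if (action.type === "SUBMIT_ANSWER" && state.phase === "question") {
    return { ...state, phase: "answered", answer: action.answer };
  }
  if (action.type === "START_EVALUATION" && state.phase === "answered") {
    return { ...state, phase: "evaluating" };
  }
  if (action.type === "RECEIVE_FEEDBACK" && state.phase === "evaluating") {
    return { ...state, phase: "feedback", feedback: action.feedback };
  }
  return state; // invalid or out-of-order action: ignore
}
```

Guarding every transition on the current phase is what kept the UI responsive and predictable: a late or duplicate action simply falls through instead of corrupting the flow.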
Challenges we ran into
Prompting Claude to evaluate reasoning depth — not just correctness — was harder than expected. Early versions basically just checked if the answer was right. Getting it to reliably name a specific misconception, explain why the logic fails, and generate a useful follow-up question took a lot of iteration. Managing state across multiple steps cleanly (question → answer → evaluation → feedback) while keeping the UI responsive was the other main friction point. And securing the API properly so keys never hit the client took more architecture thought than a hackathon usually encourages.
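The structured-output problem has a defensive side too: even a well-prompted model occasionally slips back into freeform text, so the service layer has to validate before anything reaches the UI. A minimal sketch of that validation, assuming a hypothetical evaluation shape (the real schema may differ):

```typescript
// Hypothetical shape of the structured evaluation we ask the model for.
// The point is the pattern: parse, then type-guard, so a freeform or
// malformed reply never reaches the UI as if it were valid feedback.
interface Evaluation {
  correct: boolean;
  misconception: string | null; // the specific misconception, if any
  reasoningGap: string;         // where the logic breaks down
  followUpQuestion: string;     // one question that pushes further
}

function parseEvaluation(raw: string): Evaluation | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // model returned freeform text, not JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const d = data as Record<string, unknown>;
  const ok =
    typeof d.correct === "boolean" &&
    (typeof d.misconception === "string" || d.misconception === null) &&
    typeof d.reasoningGap === "string" &&
    typeof d.followUpQuestion === "string";
  return ok ? (d as unknown as Evaluation) : null;
}
```

Returning `null` rather than throwing lets the caller decide whether to retry the request or show a graceful fallback.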
Accomplishments that we're proud of
The evaluation pipeline actually works. Claude returns structured feedback that's specific enough to be useful — not just "incorrect" but "you're confusing X with Y, here's why that matters." The real-time multiplayer flow runs cleanly between two players without lag. And the whole thing is split across three parallel modules with clean interfaces between them, which meant three developers could work simultaneously without stepping on each other.
What we learned
Prompt design is product design. The difference between Claude returning useful structured feedback and useless freeform text came down entirely to how we framed the ask. We also learned that "evaluating understanding" and "generating answers" are completely different problems — and that most AI integrations in edtech only solve the second one. Building a backend proxy for the API instead of calling it from the client directly was the right call, and we'd do it that way from the start next time.
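To make "prompt design is product design" concrete, here is a sketch of how an evaluation ask might be framed to force structured output. The field names and wording are illustrative, not our production prompt — the point is that pinning down exact output fields is what separates structured feedback from freeform text:

```typescript
// Illustrative prompt builder. Instead of "grade this answer", we spell
// out the exact JSON fields we need and forbid anything outside them.
// Field names are hypothetical, not the exact production prompt.
function buildEvaluationPrompt(question: string, answer: string): string {
  return [
    "You are evaluating a student's reasoning, not just their answer.",
    `Question: ${question}`,
    `Student answer: ${answer}`,
    "Respond with ONLY a JSON object containing these fields:",
    '  "correct": boolean,',
    '  "misconception": string or null (name the specific misconception),',
    '  "reasoningGap": string (where the logic breaks down),',
    '  "followUpQuestion": string (one question that pushes the student further).',
    "Do not include any text outside the JSON object.",
  ].join("\n");
}
```

A builder like this pairs naturally with a validator on the response side: the prompt names the fields, and the parser rejects anything that doesn't match them.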
What's next for StudyBuddy
Longer sessions with adaptive difficulty — the system should get harder as your reasoning improves, not stay static. A personal misconception tracker that builds over time so you can see which concepts you keep getting fuzzy on. Instructor-facing dashboards so teachers can see class-wide reasoning gaps, not just individual scores. And eventually, integrating with course syllabi so the questions are actually tied to what you're studying that week.
Built With
- api
- claude
- firebase
- react
- typescript
- vite