Inspiration

We noticed that hackathon teams spend hours building in the dark with zero feedback until judges score them after the deadline. Meanwhile, organizers have no visibility into team progress or whether projects overlap. We wanted to close that feedback loop and give teams a way to practice-judge their work, track competitors in real time, and submit without the tedious copy-paste into Devpost forms.

What it does

HackBuddy is a full-stack hackathon platform with a web dashboard, an MCP server for AI coding assistants, and a multi-agent backend.

For participants, it auto-detects your hackathon from a Devpost URL, lets you register a team, link a GitHub repo, and then practice-judge your project against the real rubric with an agentic AI that reads your actual code and cites specific files. It shows a live commit-heatmap timeline of competitor repos updating every 30 seconds, a countdown dashboard with milestone nudges, and a similarity detector that flags overlapping projects. It generates pitch decks, READMEs, and draft submissions, and can auto-fill your entire Devpost submission via browser automation.

For organizers, a separate dashboard lets you create hackathons (or import them from Devpost), share an MCP config with participants, and monitor team registrations, practice scores, and submissions in real time. There's also a multi-agent chat interface where an orchestrator delegates to organizer, participant, and judge sub-agents. Everything works both through the web UI and as MCP tools callable from Claude Code or Cursor.

How we built it

We built a Next.js 14 app that doubles as a full MCP server implementing JSON-RPC 2.0, so one deployment serves both the web UI and agent-callable tools. The AI judge uses Gemini 2.5 Flash in an agentic tool-calling loop that reads code iteratively (up to 10 tool calls per repo) rather than dumping everything into a single prompt. Real-time competitor tracking runs on Server-Sent Events with a background poller hitting the GitHub API every 30 seconds. Auto-submission uses Puppeteer with a human-in-the-loop design where the agent fills all fields but the user solves the CAPTCHA. Authentication is handled by Clerk with GitHub OAuth so we can auto-detect repos. We built a multi-layer Devpost scraper using JSON-LD, Cheerio HTML parsing, and LLM fallbacks, plus a fallback chain across Gemini, Groq/Llama-3.3-70B, and raw text extraction so the system stays up even when a provider goes down.
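The capped agentic loop can be sketched roughly like this. This is a simplified illustration, not our exact implementation: `callModel` and `runTool` are hypothetical stand-ins for the Gemini wrapper and the repo-reading tools, and the transcript is modeled as plain strings rather than real chat messages.

```typescript
// Hypothetical sketch of the capped agentic judging loop. callModel and
// runTool are assumed wrappers around the LLM and the repo-reading tools.
type ToolCall = { name: string; args: Record<string, string> };
type ModelTurn = { toolCall?: ToolCall; verdict?: string };

const MAX_TOOL_CALLS = 10; // hard cap per repo, as described above

async function judgeRepo(
  callModel: (history: string[]) => Promise<ModelTurn>,
  runTool: (call: ToolCall) => Promise<string>
): Promise<string> {
  const history: string[] = ["Judge this repo against the rubric."];
  for (let i = 0; i < MAX_TOOL_CALLS; i++) {
    const turn = await callModel(history);
    if (turn.verdict) return turn.verdict; // model finished investigating
    if (!turn.toolCall) break; // model produced neither -> bail out
    // Feed the tool result (e.g. file contents) back into the transcript
    history.push(`${turn.toolCall.name} -> ${await runTool(turn.toolCall)}`);
  }
  // Out of budget: force a final answer from whatever was gathered so far
  const final = await callModel([...history, "Give your final verdict now."]);
  return final.verdict ?? "inconclusive";
}
```

The key design choice is that each tool result goes back into the transcript, so the model can decide what to read next (follow an import, open a test file) instead of scoring a single giant code dump.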

Challenges we ran into

Devpost's anti-scraping measures required reverse-engineering the exact combination of session cookies, Referer, and X-Requested-With headers for paginated participant data. GitHub rate limiting with 15+ concurrent repos forced us to build multi-token rotation and cap analysis at 8 repos with lazy file loading. Gemini's safety filters would sometimes block benign code, and Groq's fallback models buried JSON inside chain-of-thought text, so we had to build custom parsers and tune filter thresholds.
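For the chain-of-thought problem, the parser we needed boils down to scanning the raw model output for balanced-brace candidates and keeping the first one that parses. A minimal sketch of that idea (not our exact parser, and it deliberately ignores braces inside JSON strings, which a production version would have to handle):

```typescript
// Simplified sketch of a fallback JSON extractor for model output that
// buries its answer inside chain-of-thought prose. It scans for
// balanced-brace substrings and returns the first one that parses.
function extractJson(raw: string): unknown {
  for (let start = raw.indexOf("{"); start !== -1; start = raw.indexOf("{", start + 1)) {
    let depth = 0;
    for (let i = start; i < raw.length; i++) {
      if (raw[i] === "{") depth++;
      else if (raw[i] === "}" && --depth === 0) {
        try {
          return JSON.parse(raw.slice(start, i + 1)); // first valid object wins
        } catch {
          break; // balanced but not valid JSON; try the next "{"
        }
      }
    }
  }
  return null; // no parseable object anywhere in the output
}
```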

Accomplishments that we're proud of

The agentic judge actually reads code like a human would, following imports, checking commits, and citing specific files in its reasoning rather than giving generic feedback. The real-time competitor tracking (seeing other teams' commit heatmaps update live during a hackathon) is a genuine competitive edge that didn't exist before. We're also proud that the entire platform works both as a polished web app and as MCP tools inside your IDE, so participants never have to leave their coding environment.

What we learned

Agentic tool-calling loops produce dramatically better judgments than single-shot prompts, and showing the agent's investigation trace makes scores feel trustworthy. We also learned to never trust LLM arithmetic (always recompute weighted totals server-side) and that fallback chains across multiple providers are essential when reliability matters during a live event.
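The server-side recomputation is simple but important: the model reports per-criterion scores, and we derive the total ourselves instead of trusting its arithmetic. A minimal sketch (the rubric shape here is an assumption, not our exact schema):

```typescript
// Sketch of the "never trust LLM arithmetic" lesson: accept the model's
// per-criterion scores, but recompute the weighted total server-side.
interface CriterionScore {
  criterion: string;
  score: number;  // model-reported score for this criterion
  weight: number; // rubric weight, defined by the organizer
}

function weightedTotal(scores: CriterionScore[]): number {
  const totalWeight = scores.reduce((sum, s) => sum + s.weight, 0);
  const raw = scores.reduce((sum, s) => sum + s.score * s.weight, 0);
  // Normalize by total weight so mis-specified weights still land on scale
  return totalWeight > 0 ? raw / totalWeight : 0;
}
```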

What's next for HackBuddy

We want to add live score diffing so teams can see which commits actually move the needle, webhook-driven updates to replace GitHub polling, and judge calibration where organizers upload past winning projects to tune the AI's scoring distribution.

Built With

  • cheerio
  • clerk
  • gemini-2.5-flash
  • github-api
  • groq
  • llama-3.3-70b
  • material-ui
  • model-context-protocol-(mcp)
  • next.js
  • nodemailer
  • pptxgenjs
  • puppeteer
  • react
  • typescript