Inspiration

Every hackathon ends the same way: a handful of projects land in the top tier and most don't, often with surprisingly little correlation to how hard the team worked. I wanted to know what the winners were actually doing differently — not in vibes, but in the data.

So I pulled 400 projects across two hackathons (Gemini and Nova, 200 each), all scored by the same four judge personas on the same 1–10 rubric. Only ~14% landed in the high-alignment tier. The question wasn't "what makes a good idea" — it was "what makes four different people agree it's a good idea."

I wrote up the full breakdown here: https://michiyamamoto.substack.com/p/the-sacrifice-equation-why-86-of

What it does

The Sacrifice Equation is a calibrated judge-simulator. You paste a project title, problem statement, and proposed solution. It runs your idea through the same four judge personas — Systems Engineer, Ambitious Founder, Busy End User, Pragmatic Operator — and returns:

  • Predicted Alignment Score (consensus strength) and Divergence Score (judge disagreement)
  • Per-judge rationales showing exactly why each persona scores you up or down
  • Core tension detected in your idea (Usability vs Complexity, Ambition vs Feasibility, Demo vs Production, etc.)
  • Three most similar historical projects from the 400-project reference set, with their actual outcomes

The point is to hear all four voices before you commit a weekend to building.
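
The write-up does not say how the three most similar historical projects are retrieved. As a rough illustration only, one simple way to do that kind of lookup is bag-of-words cosine similarity over the project text. Everything below is an assumption for the sake of the sketch: the function names, the `title`/`text` fields, and the sample projects are hypothetical and not taken from the real reference set.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and count its alphanumeric word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query, reference, k=3):
    """Return the titles of the k reference projects most similar to the query."""
    q = tokenize(query)
    scored = [(cosine(q, tokenize(p["text"])), p["title"]) for p in reference]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:k]]


# Hypothetical stand-ins for the reference set (not real projects).
reference = [
    {"title": "Receipt Scanner", "text": "scan receipts and track expenses"},
    {"title": "Expense Buddy", "text": "track shared expenses for roommates"},
    {"title": "Plant Doctor", "text": "diagnose plant disease from photos"},
    {"title": "Budget Coach", "text": "track expenses and coach budget habits"},
]

top = top_k_similar("track expenses and budget for roommates", reference)
```

In a real tool, each returned title would be shown alongside that project's recorded Alignment and Divergence Scores, which is what makes the comparison useful.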

How we built it

The project lives entirely on Zerve. Phase 1 was forensic data analysis: calibration checks, tension landscape mapping, and an attempt to reverse-engineer the Divergence Score formula (it turns out not to be max-minus-min; that formula matched the recorded score on only 16.8% of projects). Phase 2 produced replication charts across both hackathons. Phase 3 built and validated the evaluator. The frontend is a Streamlit app deployed via Zerve's one-click deploy to sacrifice-equation.hub.zerve.cloud.

Challenges we ran into

The biggest surprise was that the dataset's own scoring metrics didn't behave the way they appeared to. The Alignment Score wasn't a simple mean (round(mean(judges)) matched the recorded score on only 23.2% of projects), and the Divergence Score resisted every obvious formula I tested. I had to stop trying to explain the score and instead look at how projects clustered, which is what unlocked the four failure archetypes.
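
The formula checks described above come down to a match-rate calculation: apply a candidate formula to each project's four judge scores and count how often it reproduces the recorded score. Here is a minimal sketch of that check. The row layout (`judges`, `alignment`, `divergence`) and the sample values are assumptions made up for illustration; they are not the real dataset.

```python
from statistics import mean


def match_rate(rows, candidate, target_key):
    """Fraction of rows where candidate(judge scores) equals the recorded score."""
    hits = sum(1 for row in rows if candidate(row["judges"]) == row[target_key])
    return hits / len(rows)


def rounded_mean(scores):
    # Note: Python's round() rounds exact halves to the nearest even integer,
    # so a mean of 6.5 becomes 6. That alone can cause mismatches against a
    # dataset that rounds halves up.
    return round(mean(scores))


def score_range(scores):
    """The max-minus-min candidate for the Divergence Score."""
    return max(scores) - min(scores)


# Hypothetical rows: four judge scores plus the recorded metrics.
rows = [
    {"judges": [7, 8, 6, 7], "alignment": 7, "divergence": 2},
    {"judges": [9, 3, 8, 4], "alignment": 4, "divergence": 5},
    {"judges": [5, 5, 6, 4], "alignment": 5, "divergence": 1},
    {"judges": [2, 9, 9, 8], "alignment": 3, "divergence": 7},
]

alignment_rate = match_rate(rows, rounded_mean, "alignment")
divergence_rate = match_rate(rows, score_range, "divergence")
print(f"round(mean) match rate: {alignment_rate:.1%}")
print(f"max-minus-min match rate: {divergence_rate:.1%}")
```

A low match rate only rules out that one candidate; it says nothing about what the real formula is, which is why the analysis moved on to clustering.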

The second challenge was avoiding a flattering tool. It would have been easy to build something that tells founders their idea is great. The judge personas are deliberately tuned to disagree, because that disagreement is the signal.

Accomplishments that we're proud of

  • Identified four named failure archetypes (Universal Rejection, Polarized Reception, Overreached Ambition, Demo Trap) covering 272 of 400 projects
  • Found the bottleneck judge: Systems Engineer scores lowest on 40–52% of projects, single-handedly capping consensus
  • Replicated the "Usability vs Complexity wins" finding cleanly across two independent hackathons
  • Shipped an actual deployed app, not just an analysis — you can use it right now
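
The bottleneck-judge finding can be checked with a simple tally: for each project, find which persona gave the lowest score, then compute each persona's share across projects. The sketch below assumes that tied lowest scores count for every tied judge. The write-up does not say how ties were handled, and the sample scores are hypothetical.

```python
from collections import Counter

JUDGES = [
    "Systems Engineer",
    "Ambitious Founder",
    "Busy End User",
    "Pragmatic Operator",
]


def lowest_score_share(projects):
    """For each judge, the fraction of projects where they gave the lowest score.

    Assumption: when several judges tie for lowest, each of them is counted.
    """
    counts = Counter()
    for scores in projects:
        low = min(scores.values())
        for judge, score in scores.items():
            if score == low:
                counts[judge] += 1
    return {judge: counts[judge] / len(projects) for judge in JUDGES}


# Hypothetical per-project scores, listed in the same order as JUDGES.
projects = [
    dict(zip(JUDGES, [3, 8, 7, 6])),
    dict(zip(JUDGES, [5, 9, 4, 6])),
    dict(zip(JUDGES, [2, 7, 6, 2])),
    dict(zip(JUDGES, [6, 5, 8, 7])),
]

shares = lowest_score_share(projects)
bottleneck = max(shares, key=shares.get)
print(bottleneck, shares[bottleneck])
```

Because ties are counted for every tied judge, the shares can sum to more than 1.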

What we learned

Ambition doesn't lose points. Unfocused ambition does. The winning projects didn't avoid hard problems — they cut scope on purpose so the remaining work was credible. And consensus is fragile: three judges loving your idea can't save you from one judge giving you a 1.

The other lesson was about Zerve itself. Going from raw CSV to deployed Streamlit app in one environment, with the analysis notebook still live next to the deployment, changed how I worked. I stopped context-switching between "analysis mode" and "shipping mode."

What's next for Sacrifice Equation

  • Expand the reference set beyond 400 projects to cover more hackathon formats and judge profiles
  • Add a "fix-it" mode that suggests scope cuts to convert polarized ideas into consensus ideas
  • Open the API endpoint so other hackathon organizers can calibrate their own judge panels
  • Run it on the projects from this hackathon and see how well it predicts the outcome
