Side Bet — A social sportsbook for homework

Inspiration

One in three U.S. college students drops out, and the largest single driver isn't tuition — it's disengagement. Study after study points to the same thing: the students who quit aren't the ones who can't do the work, they're the ones who stopped showing up to it. Meanwhile, the same demographic spends hours a day on apps engineered around the mechanic that does keep humans coming back: a live, social, peer-visible scoreboard.

We kept asking: what if the loop that DraftKings uses to make people care about an NFL game on a Tuesday night could be pointed at the thing that actually decides whether you graduate — finishing your problem sets?

So we built Side Bet: a private, group-scoped prediction market where friends bet small (cosmetic) dollars on each other's coursework. "Will Justin submit Lab 4 by Friday?" "Most assignments completed this week — Ezzy or George?" An AI bookmaker prices every market against real submission history; an early-warning agent watches the betting record itself for the signature of a student who's slipping.

The pitch is dropout prevention via commitment contracts. The product is a sportsbook.

What we learned

A hackathon team of three with a sharp design doc can ship more than a hackathon team of five without one. We spent the first two hours not coding — we spent them writing a merge contract (docs/00-architecture.md) that nailed every table name, server-action signature, API route shape, and realtime channel before anyone touched a keyboard. That contract was the single most valuable artifact of the build. With it, three lanes (Platform / Agent / UI) ran in parallel for sixteen hours and the integration at hour 18 actually merged.

Some specific things we learned in the trenches:

  • LMSR is beautiful but numerically unforgiving. The naive cost function $C(q_y, q_n) = b\,\ln(e^{q_y/b} + e^{q_n/b})$ overflows as soon as a market has any meaningful volume. The log-sum-exp trick — factoring out $\max(q_y/b,\, q_n/b)$ — is non-optional, not nice-to-have.
  • Tool calling reliability is per-model, not per-vendor. Llama 3.3 70B on NVIDIA NIM is excellent at picking the right formula but occasionally returns mixed prose-plus-JSON, so we shipped an extractJson helper that scans for the last balanced {…} block instead of trusting JSON.parse(content).
  • Atomicity in Postgres is surprisingly easy if you give in to plpgsql. A single SELECT … FOR UPDATE on the markets row was the difference between a working market and a market whose share counts diverged from its own price under concurrent trades.
  • "USD" doesn't need Stripe. A column rename — credits int → balance_cents int — plus one formatUsd helper plus a single rounding rule at the LMSR boundary turns play credits into a believable wallet.
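The balanced-block scan mentioned in the tool-calling bullet can be sketched roughly as follows. This is a minimal illustration, not the shipped helper: the signature, the return type, and the string-aware brace counting are all assumptions.

```typescript
// Hypothetical sketch of an extractJson-style helper (not the project's code).
// Scans left to right, tracks top-level {...} blocks, and returns the last one
// that parses as JSON, or null if none does.
function extractJson(content: string): Record<string, unknown> | null {
  let depth = 0;          // current brace nesting depth
  let start = -1;         // index of the "{" that opened the current top-level block
  let inString = false;   // true while inside a JSON string within a block
  let escaped = false;    // true right after a backslash inside a string
  let last: Record<string, unknown> | null = null;

  for (let i = 0; i < content.length; i++) {
    const ch = content[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue; // braces inside strings do not count
    }
    if (ch === '"' && depth > 0) {
      inString = true; // only track strings inside a candidate block
    } else if (ch === "{") {
      if (depth === 0) start = i;
      depth++;
    } else if (ch === "}" && depth > 0) {
      depth--;
      if (depth === 0) {
        try {
          last = JSON.parse(content.slice(start, i + 1));
        } catch {
          // balanced but not valid JSON: keep any earlier result
        }
      }
    }
  }
  return last;
}
```

Called on a reply such as `Sure! {"p": 0.62} Hope that helps.`, the sketch returns the parsed object and ignores the surrounding prose.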

How we built it

Stack at a glance

| Layer | Choice |
| --- | --- |
| Frontend | Next.js 16 (App Router), shadcn/ui, Tailwind v4, Recharts |
| Backend | Next.js server actions + API routes, deployed to Vercel |
| DB / Auth / Realtime | Supabase (Postgres + RLS + realtime channels) |
| AI | NVIDIA NIM via the OpenAI-compatible client: meta/llama-3.3-70b-instruct, falling back to nvidia/llama-3.1-nemotron-70b-instruct |
| Real Canvas integration | Browserbase + Playwright-CDP (cookie-capture SSO, no credentials stored) |
| Email | Resend (group invites + magic links) |
| Tests | Vitest |
| Package manager | Bun |

The market mechanism: LMSR

Every Side Bet market is a binary YES/NO Logarithmic Market Scoring Rule market — the same primitive used for head-to-head bets, daily props, and vote-resolved open-ended bets. Only resolution_criteria differs.

The cost function:

$$C(q_y, q_n) = b \cdot \ln\!\left(e^{q_y/b} + e^{q_n/b}\right)$$

The instantaneous YES price (which is also the market's implied probability):

$$p_\text{yes} = \frac{e^{q_y/b}}{e^{q_y/b} + e^{q_n/b}}$$

The cost to buy $\Delta$ YES shares:

$$\text{cost}(\Delta) = C(q_y + \Delta,\, q_n) - C(q_y,\, q_n)$$

When the AI bookmaker decides a market should open at probability $p_0$, we seed share counts so the price starts there:

$$q_y - q_n = b \cdot \ln\!\frac{p_0}{1 - p_0}$$

The liquidity parameter $b$ controls market depth — we use $b = 50$ for daily props, $100$ for weekly head-to-heads, $200$ for season-long markets. All the math lives in lib/lmsr/index.ts as a single source of truth that both the trade engine (Person A) and the agent toolkit (Person B) import.
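The formulas above can be sketched in TypeScript using the log-sum-exp form, which matters because Math.exp returns Infinity once its argument passes roughly 709.78. The function names here are illustrative and are not taken from lib/lmsr/index.ts. Putting the whole seed offset on one side and leaving the other at zero is an assumption, since only the difference $q_y - q_n$ is pinned down by $p_0$.

```typescript
// Stable ln(e^a + e^b): factor out the max, then add it back outside the log.
function logSumExp(a: number, b: number): number {
  const m = Math.max(a, b);
  return m + Math.log(Math.exp(a - m) + Math.exp(b - m));
}

// C(q_y, q_n) = b * ln(e^{q_y/b} + e^{q_n/b})
function cost(qYes: number, qNo: number, b: number): number {
  return b * logSumExp(qYes / b, qNo / b);
}

// p_yes = e^{q_y/b} / (e^{q_y/b} + e^{q_n/b}), rewritten as a logistic
// function of (q_y - q_n) / b so that no large exponent is ever divided.
function priceYes(qYes: number, qNo: number, b: number): number {
  return 1 / (1 + Math.exp((qNo - qYes) / b));
}

// cost(delta) = C(q_y + delta, q_n) - C(q_y, q_n)
function costToBuyYes(qYes: number, qNo: number, b: number, delta: number): number {
  return cost(qYes + delta, qNo, b) - cost(qYes, qNo, b);
}

// Seed shares so the opening price equals p0 (requires 0 < p0 < 1).
// Assumption: the offset goes entirely on one side; the other side starts at 0.
function seedShares(p0: number, b: number): { qYes: number; qNo: number } {
  const diff = b * Math.log(p0 / (1 - p0));
  return diff >= 0 ? { qYes: diff, qNo: 0 } : { qYes: 0, qNo: -diff };
}
```

As a sanity check, buying 10 YES shares in a fresh market with $b = 100$ costs a little over 5.00 and under 5.25, because the price moves from 0.5 to about 0.525 during the purchase.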

The AI bookmaker — formulas as tools

We wanted an agent that visibly thinks — not a black-box LLM that emits a number. So we built a deterministic formula toolkit (lib/odds/) and exposed each formula as an OpenAI-compatible tool spec. The model picks formulas; TypeScript does the math.

The toolkit includes, among others:

  • estimate_prob_more_completions: given two students' weekly rates and variances, returns $P(A > B) = \Phi\!\left(\dfrac{\mu_A - \mu_B}{\sqrt{\sigma_A^2 + \sigma_B^2}}\right)$
  • estimate_prob_submits_by_deadline: base rate × time-pressure factor
  • seed_lmsr_shares: the inversion above
  • apply_house_edge: pulls a fair probability toward $0.5$ by a small fraction
  • compute_at_risk_score: deterministic severity classifier (none / yellow / red)
  • recommend_spread_handicap: extra completions credited to the weaker subject
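Two of the formulas in the list are simple enough to sketch. The version below is an illustration under assumptions, not the code in lib/odds/: the function names are camelCase stand-ins, the normal CDF uses the Abramowitz and Stegun 7.1.26 approximation of erf (accurate to about 1.5e-7), and the default edge of 0.05 is a made-up value.

```typescript
// erf via Abramowitz and Stegun formula 7.1.26.
function erf(x: number): number {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

// Standard normal CDF, Phi(z).
function normalCdf(z: number): number {
  return 0.5 * (1 + erf(z / Math.SQRT2));
}

// P(A > B) = Phi((muA - muB) / sqrt(varA + varB)), treating the two weekly
// completion counts as independent normal variables.
function estimateProbMoreCompletions(
  muA: number, varA: number, muB: number, varB: number,
): number {
  const sd = Math.sqrt(varA + varB);
  if (sd === 0) return muA > muB ? 1 : muA < muB ? 0 : 0.5; // degenerate case
  return normalCdf((muA - muB) / sd);
}

// Pull a fair probability toward 0.5 by a fraction `edge` (hypothetical default).
function applyHouseEdge(p: number, edge: number = 0.05): number {
  return p + edge * (0.5 - p);
}
```

For example, a student averaging 6 completions against one averaging 5, each with variance 1, comes out at roughly a 76% favorite before any edge is applied.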

The agent loop in lib/ai/agent-loop.ts runs a standard tool-call cycle, but with three demo-grade safeguards: a 4-second wall-clock timeout (Promise.race), an automatic retry on the fallback model, and — if all else fails — a deterministic fallback that returns the same Zod-validated shape in <50 ms, computed entirely from the toolkit. The demo can never hang on an AI call.
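The three safeguards can be sketched as a small wrapper. This is an illustration only: callModel and deterministic are hypothetical stand-ins for the real tool-call cycle and toolkit fallback, the sketch assumes the timeout applies to each attempt separately, and Zod validation of the result is left out to keep the block self-contained.

```typescript
type OddsSource = "primary" | "fallback-model" | "deterministic";
type Odds = { probability: number; source: OddsSource };

// Model order taken from the stack description; the labels are illustrative.
const ATTEMPTS: ReadonlyArray<{ model: string; source: OddsSource }> = [
  { model: "meta/llama-3.3-70b-instruct", source: "primary" },
  { model: "nvidia/llama-3.1-nemotron-70b-instruct", source: "fallback-model" },
];

// Reject if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  // Clear the timer either way so it cannot keep the process alive.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Try each model under the timeout, then fall back to the deterministic path.
async function priceWithSafeguards(
  callModel: (model: string) => Promise<number>,
  deterministic: () => number,
  timeoutMs: number = 4000,
): Promise<Odds> {
  for (const { model, source } of ATTEMPTS) {
    try {
      const probability = await withTimeout(callModel(model), timeoutMs);
      return { probability, source };
    } catch {
      // Timeout or model error: move on to the next safeguard.
    }
  }
  return { probability: deterministic(), source: "deterministic" };
}
```

Because the last branch never awaits a network call, the wrapper always resolves, which is the property the demo depends on.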

Atomic trade execution

The non-negotiable in any prediction market is that two simultaneous trades can't corrupt the share counts. Our execute_lmsr_trade Postgres function does, in one transaction:

  1. SELECT … FOR UPDATE on the markets row (row-level lock)
  2. Reject if the market is closed, locked, or past closes_at
  3. Reject if the user is not in the market's group
  4. Reject if the market is vote-resolved and the caller is the creator (defense in depth — the server action checks this too)
  5. Compute LMSR cost, round up to the nearest cent, deduct from balance_cents
  6. Update market shares + current_price_yes, upsert position, insert trade, insert price-history row
  7. Return the new state

Closing a position rounds down to the nearest cent, so the LMSR pool can never be drained by sub-cent rounding leakage from repeated buy/close cycles.
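The rounding rule can be illustrated with a pair of conversion helpers plus a formatter. The names buyCostCents and closeProceedsCents are hypothetical, the formatUsd body is a guess at what such a helper might do, and the small epsilon that absorbs floating-point noise (so that 0.07 * 100 does not round up to 8 cents) is an assumption of this sketch.

```typescript
// Tolerance for float noise when converting dollars to cents (assumed value).
const EPS = 1e-9;

// Buying: the trader pays the LMSR cost rounded UP to a whole cent.
function buyCostCents(costUsd: number): number {
  return Math.ceil(costUsd * 100 - EPS);
}

// Closing: the trader receives the LMSR value rounded DOWN to a whole cent.
function closeProceedsCents(valueUsd: number): number {
  return Math.floor(valueUsd * 100 + EPS);
}

// Render integer cents as a dollar string, e.g. 513 becomes "$5.13".
function formatUsd(cents: number): string {
  const sign = cents < 0 ? "-" : "";
  const abs = Math.abs(cents);
  const dollars = Math.floor(abs / 100);
  const remainder = String(abs % 100).padStart(2, "0");
  return `${sign}$${dollars}.${remainder}`;
}
```

With an LMSR value of 5.125, a buy is charged 513 cents and an immediate close returns 512, so a buy/close round trip can only ever cost the trader, never the pool.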

Real Canvas integration via Browserbase

We started with mock Canvas data, then went further: a working SSO flow that drops the user into a remote Chromium running on Browserbase, lets them sign into UCSC's CAS (CruzID + Gold password + Duo) inside that Chromium, then captures cookies over the Chrome DevTools Protocol once login completes. We never see, store, or proxy the password — only the post-auth cookies. The lifecycle (one long-lived CDP connection cached per session, garbage-collected after 10 minutes) is a real piece of infrastructure that survives Browserbase's free-tier session timeouts.

The pivot: from public credits to private USD groups

Mid-hack we cut the public friendships graph entirely and rebuilt around groups: a user joins a group via 6-character invite code or emailed token (Resend), an "active group" cookie scopes the home page, and every market lives inside exactly one group. Vote-resolved markets cover the stuff Canvas can't grade, like "Will Justin pull an all-nighter in the library tonight?" They resolve via a 24-hour group vote after resolves_at: a strict majority wins, while a tie or zero votes voids the market and refunds everyone their cost basis.
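The vote-resolution rule is small enough to state as code. A minimal sketch, assuming each group member casts at most one YES or NO vote and that the majority is taken over votes actually cast; the type and function names are illustrative.

```typescript
type Vote = "YES" | "NO";
type Outcome = "YES" | "NO" | "VOID";

// Strict majority of cast votes wins; a tie or zero votes voids the market.
function tallyVotes(votes: ReadonlyArray<Vote>): Outcome {
  const yes = votes.filter((v) => v === "YES").length;
  const no = votes.length - yes;
  if (yes > no) return "YES";
  if (no > yes) return "NO";
  return "VOID"; // covers both the tie and the empty ballot
}
```

A VOID outcome is the case where every position holder gets their cost basis back.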

That pivot, in retrospect, was the move that made the product feel real. A private group of friends betting on each other is a plausible product. A public marketplace of strangers betting on each other's grades is a privacy disaster.

Realtime

Six Supabase channels feed the live UI: markets:open, market:{id}, trades:market:{id}, nudges:user:{id}, market:{id}:votes, group:{id}:markets. Price changes trigger a 400 ms gold flash on the price label; new price_history rows extend the Recharts line with animationDuration: 300 so the chart breathes instead of jumping.

Challenges we faced

  • Concurrent trades, divergent prices. Our first trade implementation read the market state, computed cost, and wrote back — three separate statements. Two simultaneous YES buys both saw the pre-trade state and both wrote back; the pool ended up with too many shares for too little cost. Wrapping everything in a single plpgsql function with FOR UPDATE solved it for good.
  • LMSR overflow. Once a market had $\sim 200$ YES shares with $b=100$, $e^{q/b}$ overflowed. We refactored every cost calculation to use the log-sum-exp form: factor out the max, then add it back outside the log. Without this, the math is nominally correct and operationally wrong.
  • Llama tool-calling quirks. Llama 3.3 70B occasionally returns prose-then-JSON, occasionally calls an unknown tool, occasionally produces invalid JSON in tool args. The agent loop tolerates all three — bad JSON gets a "your previous arguments were invalid, retry" tool message; unknown tools get an unknown_tool error returned as a tool result; mixed output is parsed by scanning for the last balanced {…} block.
  • Vercel cron on the free tier. We needed a tick that locks vote-resolved markets at resolves_at, tallies them at vote_closes_at, settles canvas-grounded markets, and runs the at-risk scan. We protect /api/cron/tick with a CRON_SECRET bearer header and expose a manual curl so the demo can advance time on command.
  • Cents rounding at the LMSR boundary. Floats live everywhere inside lib/lmsr; integer cents live everywhere in the database. The conversion happens once per trade, in opposite directions for buy vs. close, so the house can never bleed sub-cent dust over thousands of trades. Vitest cases cover round-half on both paths.
  • Email scanner token consumption. Magic-link tokens were getting burned by mail-client URL scanners before the user clicked. We split auth into a code step + interstitial, and the OTP guard now accepts 6–10 digit codes to match what Supabase actually sends.
  • Demo flop-prevention. A live demo is unforgiving. We pre-seeded a "Side Bet Demo" group with three personas (Ezzy, George, Justin) and 4 weeks of varied submission history each, added "Sign in as…" demo buttons that bypass magic-link round-trips, layered in a welcome tour, and even shipped a mean-reverting ghost-trader so the chart looks alive when no human is trading. Then we recorded a 90-second screen capture as a backup, on principle.
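The CRON_SECRET check on /api/cron/tick might look roughly like the helper below. The name isAuthorizedCron is hypothetical, and the sketch assumes the header arrives in the form `Authorization: Bearer <secret>`. Failing closed when the secret is unset is a choice made for this sketch, not something the writeup states.

```typescript
// Returns true only when the Authorization header carries the expected bearer token.
function isAuthorizedCron(
  authorization: string | null,
  secret: string | undefined,
): boolean {
  // Fail closed: with no configured secret, nobody is authorized.
  if (!secret) return false;
  return authorization === `Bearer ${secret}`;
}
```

A route handler would call this with the incoming header and the CRON_SECRET environment value and return 401 on false. A hardened version would likely swap the string comparison for a constant-time one.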

What's next

If we keep going: real Canvas OAuth across more institutions (the Browserbase prototype is the proof of concept), an opt-in charitable loss pool (losers' cosmetic dollars become a real donation to a campus food bank — pro-social, not predatory), and a syndicate mode where a study group bets collectively against another study group's combined GPA. The framing we keep coming back to: we didn't build a gambling app — we built a dropout-prevention tool with a gambling aesthetic.
