Inspiration

Murder mystery party kits have one fatal flaw: they're scripted. Every group plays the same story. The killer is whoever bought the box. Once you've played a kit, you've played it forever.Meanwhile, social deduction games like Werewolf, Among Us, and Mafia have proven that the most fun party games are ones where one of you is lying — but they're all genre-constrained. Werewolf is werewolves. Among Us is space crewmates. Nobody has made a narrative social deduction game where the mystery itself is different every time.We wanted Whodunit to fill that gap: an AI-generated murder mystery party game where every game produces a unique case (different victim, killer, motive, weapon, suspects, setting), one player is secretly working against the group, and the entire thing plays in 10-15 minutes on phones. Infinitely replayable. No scripts. No spoilers. Just six suspects, one liar, and one body.

What it does

Whodunit is a real-time multiplayer murder mystery party game for 4-6 players.A host opens the app, picks a setting (Victorian Manor, 1920s Speakeasy, or Space Station), and the system generates a complete unique mystery using Gemini: victim, six suspects, one killer, motive, means, and a hidden evidence trail. Other players join with a 4-character code.When the game starts, every player is privately assigned a role:

Most are Detectives — they need to find the killer. One is secretly a Saboteur — they're told the killer's identity from the start. They win if no detective accuses correctly. The game runs in five interrogation rounds. Each round, every player privately interrogates one AI-voiced suspect — a real chat conversation, with the AI playing that character. Innocent suspects answer mostly truthfully but include red herrings. The killer lies systematically about their alibi, their motive, their connection to the victim — and never confesses, no matter how hard you press.Players pin specific suspect statements to their private Evidence Board. They can also broadcast a full interrogation transcript to a Shared Feed visible to everyone. This is how detectives collaborate — and how the Saboteur weaponizes their hidden knowledge by sharing misleading evidence.After five rounds, the Accusation Phase begins. Players accuse a suspect in turn, with up to two attempts each. The first correct accusation wins. If no one accuses correctly, the Saboteur wins.Then comes the Reveal: a cinematic four-screen sequence showing the killer, the motive, the means, and the key piece of evidence that would have given it away. Finally, the Scoreboard reveals the Saboteur's identity to everyone.

How we built it

We built Whodunit entirely on MeDo with no hand-coded code. Vite + React + TypeScript + Tailwind on the frontend, Supabase for backend (Postgres + Realtime + Edge Functions), Gemini 2.5 Flash for all LLM calls — every line of which MeDo generated from PRDs. The slice-by-slice methodology We didn't try to describe the whole game in one prompt. Instead, we wrote six incremental PRDs, each focused on adding one vertical slice of functionality:

Slice 1 — Lobby & Roles: Multiplayer plumbing. Join codes, real-time lobby joins, random secret role assignment. Slice 2 — Mystery Generation: Setting picker, AI-generated mysteries, briefing flow. Slice 3 — Interrogations: Private AI-voiced chats with each suspect. Slice 4 — Evidence & Rounds: Per-player evidence boards, screenshot broadcasting, 5-round timer. Slice 5 — Accusations & Reveal: Turn-based accusations, cinematic reveal sequence, scoreboard. Slice 6 — Visual Identity: Complete aesthetic overhaul into a noir / detective-novel look.

Each slice was its own PRD pasted into MeDo. MeDo reformulated each into a structured spec, generated working code, and provided everything — frontend, edge functions, database migrations, RLS policies. When something broke, we sent targeted follow-up prompts and MeDo fixed those bugs without regenerating from scratch. The most critical first step: derisking before designing Before committing to the design, we tested whether MeDo's stack could even do what we needed. We had MeDo build a throwaway two-user "secret word" app: User A types a word, User B sees "A has submitted" without seeing the word, then taps Reveal to see it. That experiment proved MeDo could handle per-user views, real-time state sync, and basic privacy enforcement. Only then did we commit to building Whodunit. That single de-risking experiment cost us ~90 credits and saved us from building the whole game on architecturally unsound foundations. Mystery generation: solution-first, top-down The biggest risk in AI-generated mysteries is plot holes: the killer's alibi accidentally matching the evidence, motives that don't fit, "clues" that don't actually point anywhere. We solved this by instructing Gemini to generate the mystery solution-first:

First the victim. Then the killer's motive and means. Then the killer suspect (with a deliberately false but plausible alibi). Then the five innocent suspects, each with their own alibi and red herrings. Finally the key evidence — the specific clue that, if discovered, points back to the killer.

This generation order guarantees the mystery is internally consistent. The killer's lies are anchored to a real solution that exists in the database. Interrogation AI: distinct voices, consistent lies Every player message in an interrogation triggers a Gemini call with the suspect's full character data, the mystery solution context, the conversation history, and detailed behavioral rules. The result is genuinely surprising:

Each suspect maintains a distinct voice — a Victorian governess speaks nothing like a 1920s bouncer or a 22nd-century space station security chief. The killer lies consistently across messages and across players. Same false alibi to everyone. Innocents answer truthfully but stay bounded — they don't know who the actual killer is, only their own perspective. Suspects refuse to break character — including refusing to admit they're AI when directly asked.

A real example from a generated Victorian game:

Player: Where were you when the murder happened? Dr. Eleanor Vance: I was in the infirmary, tending to Mrs. Davies. She was suffering from a truly dreadful migraine, and I administered a sedative. I stayed with her until she finally drifted off, around half past midnight. Player: Can anyone confirm you were there? Dr. Eleanor Vance: No, she wouldn't be able to. As I said, I administered a sedative, and she was already deeply asleep by midnight.

Dr. Vance was the killer. Notice how she preemptively defends against the follow-up question — the sedative wasn't just flavor; it explains why her alibi can't be corroborated. That's emergent from the prompt structure, not scripted. Privacy: three layers of enforcement The mystery's solution — killer identity, motive, key evidence, suspect alibis — is the most sensitive data in the app. If a clever player inspects the network tab, the game is broken. We enforced privacy at three independent layers:

Database-level Row Level Security policies (USING (false)) block direct client reads of the mysteries and suspects tables. Even with a valid Supabase anon key, no client can run SELECT * FROM mysteries. Edge functions explicitly select only public fields before returning data. The killer flag, motive, means, and key_evidence are loaded into the function's memory for prompt construction but never serialized to the client. The Saboteur's killer info is gated behind a role-verified server-side check. A Detective's client can never receive killer information under any circumstance, no matter what request they craft.

We verified all three layers manually using browser dev tools throughout development.

Challenges we ran into

The multiplayer feasibility question. Before MeDo, we genuinely didn't know whether the platform could handle 4-6 simultaneous users on shared state with per-user privacy. We had to test it first or risk wasting the whole project on something architecturally impossible. Mystery plot consistency. Our first attempt at mystery generation produced plot holes — clues that didn't actually point to the killer. Switching to solution-first generation fixed it. The killer's lying behavior. It took prompt engineering to make Gemini lie systematically and consistently instead of either confessing instantly or being so evasive it broke the game. The key was distinguishing between core facts (alibi, motive — lie about these) and peripheral details (occupation, general feelings — be truthful about these). Real-time round timer. Server-authoritative timers across multiple clients are harder than client-side ones. MeDo's first implementation had a flickering Evidence tab from a misconfigured Supabase Realtime subscription. We fixed it with one targeted iteration message. Visual identity. Our first five slices produced a working game with a generic dark-mode SaaS look. It read as "AI-generated React" rather than "a real game." Slice 6 was a full visual identity pass — noir / detective-novel aesthetic, serif typography system, parchment textures, card-flip role reveal, cinematic reveal sequence.

Accomplishments that we're proud of

A complete, playable, multiplayer game built end-to-end on MeDo with zero hand-written code. Frontend, backend, database, AI integration, real-time sync — all generated. AI suspects with genuine personality. Reading a Victorian doctor preemptively defend a fabricated alibi, or a 1920s nightclub bouncer deflect questions in period-appropriate slang, is the moment where Whodunit feels less like a tech demo and more like a real game. Three layers of privacy that actually hold up. A Detective can dig through dev tools all they want — they cannot find the killer's identity. Replayability. Every Whodunit game is genuinely different. Same setting, fresh victim, fresh suspects, fresh motive, fresh weapon. No two playthroughs are the same.

What we learned

PRDs are leverage. Spending an hour writing a precise spec saves five hours of build iterations. The clearer the entity model and behavioral rules in the PRD, the more useful MeDo's first generation is. Vertical slices beat horizontal ones. Building "the entire data model" then "the entire backend" then "the entire frontend" is a recipe for never having a working app. Building thin end-to-end slices means you always have something demoable and always know what's broken. AI hallucinations have rules. With the right prompt structure, an LLM can be made to reliably lie about specific things while telling the truth about others. This is gameable, and it's the foundation of social deduction.

What's next for Whodunit

More settings (we shipped three — Victorian Manor, 1920s Speakeasy, Space Station — but the engine supports infinite variation) AI-generated suspect portraits per game using image generation Voice acting for suspects via TTS (so you literally hear the lies) Sound design and ambient audio per setting Custom mystery generation — let players seed their own setting prompts

Built With

Share this project:

Updates