Inspiration

Social deduction games (such as Werewolf, Mafia, or Among Us) are rich environments for AI research. They require theory of mind, deception, trust, and real-time reasoning. We were inspired by the question: can a large language model play both a crewmate (gathering information, doing tasks, and deducing who the killer is) and an impostor (seeking out crewmates, sabotaging, killing, and lying)? We wanted to push LLMs into a domain of partial information, social manipulation, and group dynamics. Bluffing, creating alibis, keeping track of information, and convincing others are the keys to winning this game.

What it does

MASDE is a fully simulated Among Us environment built in the Unity game engine, where every player (both crewmate and impostor) is controlled by an LLM agent. Crewmates navigate the map, complete assigned tasks (common, short, and long), report dead bodies, call emergency meetings, and vote to eject players they think are impostors. Impostors kill crewmates, sabotage objectives, use the secret vent network to traverse the map covertly, and must blend in socially. Each agent receives a structured JSON state packet describing its location, nearby agents, proximity events, sabotage status, available actions, and more, and responds with a chosen action. Events like task completions, kills, and sabotage resolutions are batched and dispatched over TCP to a Python LLM backend. The simulation pauses while the backend reasons and resumes once the chosen actions come back. This little simulation also works as a desk pet you can let sit in the corner and watch (it's pretty interesting!).
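To make the state-packet and action-response exchange concrete, here is a minimal Python sketch. Every field name (`agent_id`, `nearby_agents`, and so on), the `"wait"` fallback, and the helper names are assumptions made for illustration; the real packet carries more fields and may be shaped differently.

```python
import json
from dataclasses import dataclass, field


@dataclass
class StatePacket:
    """One agent's observation. All field names here are hypothetical."""
    agent_id: str
    role: str  # "crewmate" or "impostor"
    location: str
    nearby_agents: list[str] = field(default_factory=list)
    available_actions: list[str] = field(default_factory=list)


def parse_state(raw: str) -> StatePacket:
    """Decode a JSON state packet, ignoring keys this sketch does not model."""
    data = json.loads(raw)
    known = {k: data[k] for k in StatePacket.__dataclass_fields__ if k in data}
    return StatePacket(**known)


def build_action(state: StatePacket, chosen: str) -> str:
    """Encode the reply; fall back to a legal action if the choice is not allowed."""
    if chosen not in state.available_actions:
        # "wait" is a made-up no-op used only when no actions are listed.
        chosen = state.available_actions[0] if state.available_actions else "wait"
    return json.dumps({"agent_id": state.agent_id, "action": chosen})
```

Checking the model's choice against `available_actions` before replying is one simple way to keep a hallucinated action from ever reaching the simulation.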

How we built it

Our stack is split between a Unity game simulation layer and a Python LLM orchestration layer, connected over TCP sockets.

Unity (C#): Core game logic is distributed across several scripts. Amongi.cs (yes, we made up the name; each agent is called an Amongi) handles agent state, kills, vents, and sabotage reactions. The rest lives in AgentPathFollower.cs (waypoint-graph pathfinding with A*/BFS), TaskManager.cs (randomized task pool assignment), LocationManager.cs (room tracking and proximity detection), and SabotageManager.cs.

Event System (Connection.cs): When a notable event fires (kill completed, player spotted, task done), sendInformation() builds a JSON envelope and queues it. A batch window accumulates all near-simultaneous events into a single packet, and then we freeze the simulation until the LLM responds.

Python Backend: Receives the batched JSON, routes each agent's observations to the appropriate LLM call, and returns a structured action response that Unity deserializes and dispatches.

Observation design: Each state packet includes the agent's location, the agents sharing its room (with an electrical blackout exception for crewmates), proximity events, available actions, role-specific context, and more.
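The batch window in the event system can be illustrated with a small grouping function. This is a Python sketch of the idea only, not the C# code in Connection.cs: the `Event` fields, the `batch_events` name, and the 0.25-second window are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Event:
    time: float  # simulation time in seconds
    kind: str  # e.g. "kill_completed", "task_done"
    agent_id: str


def batch_events(events: list[Event], window: float = 0.25) -> list[list[Event]]:
    """Group events so each batch spans at most `window` seconds from its first event."""
    batches: list[list[Event]] = []
    for event in sorted(events, key=lambda e: e.time):
        if batches and event.time - batches[-1][0].time <= window:
            batches[-1].append(event)  # close enough to join the open batch
        else:
            batches.append([event])  # start a new batch
    return batches
```

Each inner list would then be serialized as one packet, so the backend is asked to reason once per burst of events instead of once per event.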

Challenges we ran into

One challenge was that at 8 pm on 2/21/26, we accidentally deleted our entire environment codebase. Rebuilding it took a lot of effort, but by some miracle, we were able to code everything back in. Another challenge was building our voting and chat system. We solved it by keeping all of that information in our LLM backend and sending only the outcome (who, if anyone, was voted out) as text to our Unity front end.
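As a rough illustration of the voting flow, the backend could tally votes itself and forward only a short result string. The sketch below is hypothetical: the function names, the `"skip"` label, and the rule that a tie or a skip majority ejects no one are assumptions borrowed from how Among Us usually works, not a description of our exact code.

```python
from collections import Counter
from typing import Optional


def tally_votes(votes: dict[str, Optional[str]]) -> Optional[str]:
    """Return the ejected agent, or None on a tie or a skip majority (assumed rule).

    `votes` maps each voter to a target, with None meaning "skip".
    This sketch assumes no agent is literally named "skip".
    """
    counts = Counter("skip" if target is None else target for target in votes.values())
    if not counts:
        return None
    ranked = counts.most_common(2)
    top_name, top_votes = ranked[0]
    tied = len(ranked) > 1 and ranked[1][1] == top_votes
    if tied or top_name == "skip":
        return None
    return top_name


def result_message(ejected: Optional[str]) -> str:
    """The only text forwarded to the front end in this sketch."""
    return f"{ejected} was ejected." if ejected else "No one was ejected."
```

Keeping the chat log and ballots on the Python side means the front end only ever needs to display one line of text per meeting.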

Accomplishments that we're proud of

We literally recreated the popular video game Among Us within 12 hours because our entire codebase got deleted. That's gotta be cool, no? On top of this, we achieved fully autonomous agent interactions. Integrating a Python LLM backend with a Unity front end is also quite different from what most frameworks and apps do. Also, the entire system is easy to scale: you can keep adding agents, and it'll still work!

What we learned

Integrating over sockets and JSON packets is very hard without clear communication. Knowing which packets to send and exactly how to structure them was a challenge. On top of this, partial information design is not easy. Deciding what each agent should and shouldn't know at any given time, and enforcing this in our data, is very important. Social deduction requires more than just state observations.
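One way to enforce what an agent is allowed to know is to filter at the point where its packet is built. The sketch below shows the idea for same-room visibility. The function name, the argument shapes, and the exact blackout rule (crewmates see no one while the lights are out, impostors are unaffected) are assumptions for illustration.

```python
def visible_agents(
    observer: str,
    role: str,
    locations: dict[str, str],
    blackout: bool,
) -> list[str]:
    """List the agents the observer may be told about in its state packet."""
    if blackout and role == "crewmate":
        # Assumed rule: crewmates see no one during an electrical blackout.
        return []
    room = locations[observer]
    return sorted(a for a, r in locations.items() if r == room and a != observer)
```

Because the filter runs before serialization, information an agent should not have never enters its prompt in the first place.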

What's next for Multi-Agent Social Deduction Engine (MASDE)

One interesting next step for this project would be pitting different LLMs (e.g., GPT, Claude, Gemini) against each other by giving each agent its own model. We could also add a win-condition analytics dashboard. And for people who actually want this as a desk pet, simplifying the system so that small local models can play on little power would be really cool.
