Ghostdiedie
Ghostdiedie turns the laptop webcam everyone already has into a real-time controller for competitive multiplayer fighting.
The Name
The name reflects exactly what the game feels like.
Ghost means your opponent is not physically standing in front of you. You are looking at a screen and throwing punches and kicks into empty space.
Die die comes from the Korean slang 다이다이, which refers to a one-on-one fight, face to face.
Put together, Ghostdiedie means fighting someone who is not present in your real space while still facing them directly in a one-on-one match. That is exactly the feeling the game creates.
Inspiration
Why do games that use real body movement still require expensive hardware?
VR headsets and motion controllers can feel immersive, but they are costly, inconvenient, and hard to set up. Most people cannot casually access that kind of experience. We wanted to keep the feeling of physical movement while removing the barrier.
So we gave ourselves one rule: the only hardware allowed is the webcam on your laptop.
Ghostdiedie started as a challenge. Could we build a real-time multiplayer game that feels physical, social, and exciting using only a browser and a standard webcam? We wanted to prove that immersive interaction does not need specialized gear if the system is designed well enough.
What It Does
Ghostdiedie is a real-time multiplayer 3D fighting game where your body is the controller.
No VR headset. No motion sensors. No download. Just open the URL and play.
You punch, your character punches. You kick, your character kicks. You jump, headbutt, and move in real time against another player online.
The game includes:
- Real-time motion control using a webcam, with actions detected at about 20 fps
- Live multiplayer fighting with server-side hit detection
- Live opponent video during the match using WebRTC
- A full game flow including room creation, arena select, character select, calibration, and best-of-3 fights
What makes Ghostdiedie stand out is that it is not just pose detection connected to animation. It is a complete playable game system built around accessibility, responsiveness, and social interaction.
How We Built It
Ghostdiedie is built as a real-time pipeline that turns body movement into game actions.
Computer Vision
Pose detection runs fully in the browser using MediaPipe with WebAssembly, so there is no Python runtime and no installation required.
We track 33 body points from the webcam at about 20 fps and convert them into game inputs using custom action detectors.
- Punch is detected when the wrist moves into the body area with enough speed
- Kick is detected using hip tilt difference
- Jump is detected on landing rather than takeoff, which proved more reliable
- Headbutt is detected from fast head movement
- Movement is based on body position and shoulder width
Before each match, players complete a short 4-second calibration step. This adjusts the system to their body, distance from the camera, and camera angle so controls feel more reliable.
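To make the detector idea concrete, here is a minimal sketch of the punch rule described above: the wrist must enter the torso region with enough speed. The threshold value, landmark shape, and function names are illustrative, not the tuned values used in the game.

```typescript
// Illustrative punch detector: fires when the wrist enters the torso
// region with enough speed. PUNCH_SPEED is a placeholder threshold.
interface Point { x: number; y: number }

const PUNCH_SPEED = 1.5; // normalized screen units per second (assumed)

// Torso box spanned by the shoulders on top and the hips on the bottom.
function insideTorso(p: Point, shoulderL: Point, shoulderR: Point, hipY: number): boolean {
  const left = Math.min(shoulderL.x, shoulderR.x);
  const right = Math.max(shoulderL.x, shoulderR.x);
  const top = Math.min(shoulderL.y, shoulderR.y);
  return p.x >= left && p.x <= right && p.y >= top && p.y <= hipY;
}

function detectPunch(
  prevWrist: Point, wrist: Point,
  shoulderL: Point, shoulderR: Point,
  hipY: number, dtSec: number
): boolean {
  // Wrist speed between two consecutive pose frames (~20 fps => dt ~0.05 s).
  const speed = Math.hypot(wrist.x - prevWrist.x, wrist.y - prevWrist.y) / dtSec;
  return speed >= PUNCH_SPEED && insideTorso(wrist, shoulderL, shoulderR, hipY);
}
```

The same pattern (a position predicate plus a speed threshold) generalizes to the other actions, with different landmarks per action.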
Game Engine
The 3D game is built with React Three Fiber and Three.js, with physics powered by Rapier.
We built a full multiplayer game flow with:
- Room creation and joining
- Arena selection
- Character selection
- Calibration
- Live combat
- Round logic
- Match results
The game includes:
- 4 character models loaded with useGLTF
- Player face images applied to the character head
- 2 arenas: an underground cage arena and a neon grid arena
Networking
We separated gameplay networking and video networking to keep the game responsive.
- WebSocket handles actions, combat results, and game updates
- FastAPI on the server calculates all damage and enforces combat logic
- WebRTC sends live video between players
- Supabase handles rooms, users, and the leaderboard
By separating game data from video, we made sure that video quality does not interfere with gameplay responsiveness.
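The split works because the game channel only ever carries small JSON events. A minimal sketch of what such a message schema could look like (the field names and message kinds here are illustrative, not our actual wire format):

```typescript
// Illustrative game-channel messages: small JSON events over WebSocket,
// while raw video travels over a separate WebRTC peer connection.
type GameMessage =
  | { kind: "action"; playerId: string; action: "punch" | "kick" | "jump" | "headbutt"; t: number }
  | { kind: "hit"; attackerId: string; damage: number; targetHp: number } // damage computed server-side
  | { kind: "round"; winnerId: string; round: number };

function encode(msg: GameMessage): string {
  return JSON.stringify(msg);
}

function decode(raw: string): GameMessage {
  const msg = JSON.parse(raw) as GameMessage;
  if (!["action", "hit", "round"].includes(msg.kind)) throw new Error("unknown message kind");
  return msg;
}
```

Keeping the schema this small means a burst of video packet loss never delays an action or a hit result.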
Challenges We Ran Into
Moving Everything Into the Browser
At first, we used Python for pose detection. It worked, but it was too heavy and too slow for the experience we wanted. Running Python and the browser together introduced lag and made the project harder to access.
We also tried Electron to bundle everything, but requiring installation went against our core goal.
So we moved the full computer vision pipeline into the browser using MediaPipe. That made the game much faster and much easier to access.
Turning Movement Into Clean Inputs
Raw body-tracking data is noisy. Fast movement can easily trigger the wrong action.
We solved this by building custom detection rules using position, speed, thresholds, cooldowns, and action priority so the controls feel intentional instead of random.
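The cooldown-and-priority idea can be sketched as a small gate that sits between the raw detectors and the game. The priority order and cooldown values below are placeholders, not our tuned settings.

```typescript
// Illustrative input gate: each action must clear its own cooldown, and
// when several detectors fire on the same frame, a fixed priority order
// picks exactly one. All constants are assumed, not the tuned values.
type Action = "headbutt" | "punch" | "kick" | "jump";

const PRIORITY: Action[] = ["headbutt", "jump", "kick", "punch"]; // assumed order
const COOLDOWN_MS: Record<Action, number> = { headbutt: 800, punch: 300, kick: 500, jump: 600 };

class InputGate {
  private lastFired: Partial<Record<Action, number>> = {};

  // Returns at most one action per frame, or null if everything is gated.
  resolve(candidates: Action[], nowMs: number): Action | null {
    for (const action of PRIORITY) {
      if (!candidates.includes(action)) continue;
      const last = this.lastFired[action] ?? -Infinity;
      if (nowMs - last >= COOLDOWN_MS[action]) {
        this.lastFired[action] = nowMs;
        return action;
      }
    }
    return null;
  }
}
```

The cooldowns suppress the double-fires that noisy tracking produces, and the priority order resolves ambiguous frames deterministically, which is what makes the controls feel intentional.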
Calibration
Every player is different. Camera angle, posture, and distance from the laptop vary a lot.
We added a short calibration step before each match so the system can adapt to the player and reduce false detections.
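One way to picture what calibration gives you: average a few seconds of neutral-pose frames into per-player baselines, then express later measurements relative to those baselines. The quantities we actually store are more involved; this sketch uses shoulder width as the running example.

```typescript
// Illustrative calibration: average measurements over the ~4-second
// neutral pose, then normalize later frames by those baselines so a
// threshold means the same thing for every player and camera distance.
interface Frame { shoulderWidth: number; centerX: number }

function calibrate(frames: Frame[]): Frame {
  const n = frames.length;
  const sumW = frames.reduce((s, f) => s + f.shoulderWidth, 0);
  const sumX = frames.reduce((s, f) => s + f.centerX, 0);
  return { shoulderWidth: sumW / n, centerX: sumX / n };
}

// Horizontal offset measured in "shoulder widths", so the same value
// means the same physical movement whether the player sits near or far.
function normalizedOffset(frame: Frame, base: Frame): number {
  return (frame.centerX - base.centerX) / base.shoulderWidth;
}
```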
Latency
Latency comes from the whole stack: camera capture, pose detection, networking, and rendering.
We reduced delay by:
- Running pose detection on the GPU through MediaPipe's WebAssembly build
- Detecting jumps on landing for reliability
- Handling player position on the server
Accomplishments
- Built a real-time multiplayer fighting game using only a webcam
- Made it playable directly in the browser with no install required
- Achieved about 65 to 70 ms total latency
- Designed custom action detection that feels responsive and natural
- Built a full game system from login to match result
- Used WebSocket and WebRTC together without conflict
Most importantly, we turned a technical experiment into something that actually feels fun, social, and replayable.
What We Learned
We learned that immersive interaction does not need specialized hardware. With the right system design, a webcam can become a surprisingly expressive controller.
We also learned that real-time systems have to be designed across every layer at once. Computer vision, networking, gameplay logic, and rendering all affect how responsive the game feels.
Our biggest takeaways were:
- You do not need special hardware for immersive interaction
- Real-time systems must be designed together across all layers
- Latency is a full-system problem
- Accessibility has to be built into the design from the start
- Seeing your opponent’s face during the match makes the experience feel much more real
What’s Next
We want to expand both the gameplay and the platform.
New Actions
- Block using crossed arms
- Dodge using side movement
- Combo detection over time
Technical Improvements
- Replace MediaPipe with a custom ONNX model
- Move physics to the server
- Use Redis for scaling
Content
- More characters with different abilities
- More arenas
- Special moves
- Spectator mode
Long-Term Vision
Ghostdiedie can go beyond games.
The same webcam-based interaction system could power fitness apps, live performances, and social experiences.
Ghostdiedie started as a fighting game, but the bigger idea is a webcam-native interaction platform that makes physical digital experiences more accessible.