Inspiration

Our inspiration stems from the nostalgia of playing Wii Sports and Wii Fit. We wanted to develop a project that promotes health with that nostalgic feeling, supported by modern technologies like AI, while removing the need for a console.

What it does

Beach Box is a 2-player, webcam-controlled fighting game that runs in the browser. MediaPipe reads your pose and hands in real time, we solve that into a rigged humanoid that mirrors your moves in a 3D arena, and Supabase Realtime syncs the two players so your punches actually land on your friend across the internet. Every match is also recorded in 5-second clips, embedded with Gemini, and made searchable, so you can ask "show me the one where I threw a big combo" and watch it back.
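
The replay search can be pictured as a nearest-neighbour lookup over clip embeddings. The following is a minimal sketch, assuming each 5-second clip already has an embedding vector and the query text has been embedded into the same space. The `Clip` type, `cosineSimilarity`, and `searchClips` are hypothetical names for illustration, and a real deployment would likely run this inside the database rather than in application code.

```typescript
// Hypothetical shape for one stored clip; field names are assumptions.
interface Clip {
  id: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors match nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the topK clips ranked by similarity to the query embedding.
function searchClips(query: number[], clips: Clip[], topK: number): Clip[] {
  return clips
    .map((clip) => ({ clip, score: cosineSimilarity(query, clip.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((entry) => entry.clip);
}
```

Calling `searchClips(queryEmbedding, clips, 3)` would return the three clips whose embeddings point in the direction closest to the query.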

How we built it

We used Next.js 16 + React Three Fiber for the app and 3D arena, MediaPipe Tasks Vision for in-browser pose + hand tracking, and Supabase (Postgres for rooms/matches/clips, Realtime for per-tick pose sync, and Storage for .webm chunks). Zustand holds all client state as a flat set of stores (game, pose, camera, sound, view, guard), and Gemini 2.5 handles the video embedding and natural-language query planning for the replay search. Damage, collisions, and guard state are all computed locally on each client and reconciled via broadcasts.
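
To make the local hit computation concrete, here is a rough sketch of how a client might resolve a single punch against the opponent's guard state. Everything in it is an assumption for illustration: the `resolvePunch` function, the `FighterState` shape, and the constants `HIT_RADIUS`, `BASE_DAMAGE`, and `GUARD_MULTIPLIER` are hypothetical and not taken from the project.

```typescript
type Vec3 = { x: number; y: number; z: number };

// Hypothetical per-fighter state; the real stores may look different.
interface FighterState {
  health: number;
  guarding: boolean;
}

// Illustrative constants, not values from the project.
const HIT_RADIUS = 0.25; // how close a fist must be to count as a hit
const BASE_DAMAGE = 10; // damage of an unguarded hit
const GUARD_MULTIPLIER = 0.2; // fraction of damage that gets through a guard

function distance(a: Vec3, b: Vec3): number {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

// Returns the defender's state after one punch attempt.
function resolvePunch(fist: Vec3, target: Vec3, defender: FighterState): FighterState {
  if (distance(fist, target) > HIT_RADIUS) return defender; // miss: unchanged
  const damage = defender.guarding ? BASE_DAMAGE * GUARD_MULTIPLIER : BASE_DAMAGE;
  return { ...defender, health: Math.max(0, defender.health - damage) };
}
```

Keeping the function pure (state in, state out) means both clients can run the same logic on the same broadcast inputs, which is one plausible way to keep reconciliation simple.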

Challenges we ran into

  1. Matching the 3D model rig to human arm movement
  2. Detecting punches consistently
  3. Multimodal media structure
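
As an illustration of the punch-detection challenge, one common approach is to threshold wrist speed between frames and apply a cooldown so a single swing is not counted twice. The sketch below is not the project's detector: `PunchDetector`, `WristSample`, `SPEED_THRESHOLD`, and `COOLDOWN_MS` are hypothetical, and the threshold assumes normalized landmark coordinates.

```typescript
// One tracked wrist position with a timestamp; shape is an assumption.
interface WristSample {
  x: number;
  y: number;
  z: number;
  timeMs: number;
}

const SPEED_THRESHOLD = 1.5; // normalized units per second, assumed
const COOLDOWN_MS = 300; // minimum gap between two detected punches, assumed

class PunchDetector {
  private previous: WristSample | null = null;
  private lastPunchMs = -Infinity;

  // Feed one sample per frame; returns true when a punch is detected.
  update(sample: WristSample): boolean {
    const prev = this.previous;
    this.previous = sample;
    if (prev === null) return false; // need two samples to measure speed

    const dt = (sample.timeMs - prev.timeMs) / 1000;
    if (dt <= 0) return false; // ignore duplicate or out-of-order frames

    const speed =
      Math.hypot(sample.x - prev.x, sample.y - prev.y, sample.z - prev.z) / dt;
    const cooledDown = sample.timeMs - this.lastPunchMs >= COOLDOWN_MS;

    if (speed >= SPEED_THRESHOLD && cooledDown) {
      this.lastPunchMs = sample.timeMs;
      return true;
    }
    return false;
  }
}
```

A speed threshold alone tends to fire on any fast arm motion, so a fuller detector would probably also check direction (toward the opponent) or arm extension.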

Accomplishments that we're proud of

We are proud of implementing motion capture without popular hardware tools like LiDAR and depth sensors. Additionally, our world model and interaction designs exceeded our expectations given our limited proficiency in 3D modeling and animation.

What we learned

We learned to overcome hardware limitations using out-of-the-box methods (specifically, introducing depth with only 2D coordinates). We also learned how to embed videos into vector databases.
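
One way to recover depth from 2D coordinates alone is a foreshortening heuristic: if an arm of known length appears shorter on screen, the missing length is assumed to point toward or away from the camera. The sketch below shows that idea under stated assumptions. It is not necessarily the method used in Beach Box, `estimateDepthOffset` is a hypothetical name, and `armLength` is assumed to be calibrated in the same units as the 2D landmarks.

```typescript
type Point2D = { x: number; y: number };

// Estimate how far the wrist is displaced along the camera axis,
// given the 2D shoulder and wrist positions and a calibrated arm length.
// By Pythagoras: armLength^2 = projected^2 + depth^2.
function estimateDepthOffset(shoulder: Point2D, wrist: Point2D, armLength: number): number {
  const projected = Math.hypot(wrist.x - shoulder.x, wrist.y - shoulder.y);
  // Tracking noise can make the projection exceed the calibrated length,
  // so clamp to avoid taking the square root of a negative number.
  const clamped = Math.min(projected, armLength);
  return Math.sqrt(armLength * armLength - clamped * clamped);
}
```

The result is only a magnitude: the heuristic cannot tell whether the arm points toward or away from the camera, and it treats the arm as a rigid segment, so a bent elbow reads as depth.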

What's next for Beach Box

Our plans for Beach Box include integrating depth-detection heuristics or incorporating depth-based machine learning models to predict depth. Additionally, we would like to add more special signature moves and interactions for our avatars, as well as custom maps.
