Beach Box

Inspiration

Our inspiration stems from the nostalgia of playing Wii Sports and Wii Fit. We wanted to develop a project that promotes health and recreates that nostalgic feeling with modern technologies like AI, while removing the need for a console.

What it does

Beach Box is a 2-player, webcam-controlled fighting game that runs in the browser. MediaPipe reads your pose and hands in real time, we solve that into a rigged humanoid that mirrors your moves in a 3D arena, and Supabase Realtime syncs the two players so your punches actually land on your friend across the internet. Every match is also recorded in 5-second clips, embedded with Gemini, and made searchable, so you can ask "show me the one where I threw a big combo" and watch it back.
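To make the replay search concrete, here is a minimal sketch of ranking clips against a query embedding by cosine similarity. The `Clip` shape, the function names, and the in-memory ranking are assumptions for illustration. The real project may rank inside Postgres instead, and the embeddings would come from Gemini rather than the toy vectors used here.

```typescript
// Hypothetical clip record; the real table schema is not shown in this write-up.
interface Clip {
  id: string;
  startSec: number; // offset of the 5-second clip within the match
  embedding: number[]; // vector produced by the embedding model
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat a zero vector as "no match"
  return dot / Math.sqrt(normA * normB);
}

// Return the topK clips most similar to the query embedding, best first.
function searchClips(query: number[], clips: Clip[], topK = 3): Clip[] {
  return clips
    .map((clip) => ({ clip, score: cosineSimilarity(query, clip.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((x) => x.clip);
}
```

A natural-language question would first be embedded into the same vector space, then passed to `searchClips` as `query`.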

How we built it

We used Next.js 16 + React Three Fiber for the app and 3D arena, MediaPipe Tasks Vision for in-browser pose + hand tracking, and Supabase (Postgres for rooms/matches/clips, Realtime for per-tick pose sync, and Storage for .webm chunks). Zustand holds all client state as a flat set of stores (game, pose, camera, sound, view, guard), and Gemini 2.5 handles the video embedding and natural-language query planning for the replay search. Damage, collisions, and guard state are all computed locally on each client and reconciled via broadcasts.
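As an illustration of computing damage locally and reconciling through broadcasts, the sketch below applies a broadcast hit event to a fighter's state in an idempotent way, so a hit that arrives twice is only counted once. The `FighterState` and `HitEvent` shapes, the hit `id` field, and the guard multiplier are all assumptions made for this example, not the project's actual data model or tuning.

```typescript
// Hypothetical state and event shapes, used only for this sketch.
interface FighterState {
  hp: number;
  guarding: boolean;
  appliedHits: string[]; // ids of hits already applied on this client
}

interface HitEvent {
  id: string; // unique per punch, so duplicate broadcasts can be ignored
  damage: number;
}

// Assumed value: guarding lets 25% of the damage through.
const GUARD_MULTIPLIER = 0.25;

// Pure reducer: returns the next state without mutating the input.
function applyHit(state: FighterState, hit: HitEvent): FighterState {
  if (state.appliedHits.includes(hit.id)) return state; // already reconciled
  const dealt = state.guarding ? hit.damage * GUARD_MULTIPLIER : hit.damage;
  return {
    ...state,
    hp: Math.max(0, state.hp - dealt), // never drop below zero
    appliedHits: [...state.appliedHits, hit.id],
  };
}
```

Because the reducer is pure and keyed by hit id, both clients can run it on every received event and converge on the same health values regardless of duplicates.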

Challenges we ran into

Matching the 3D model rig to human arm movement
Consistent punch detection mechanisms
Multimodal media structure
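To show why consistent punch detection is tricky, here is one simple heuristic sketched under stated assumptions: a punch fires when the wrist moves faster than a threshold, and a cooldown prevents one fast swing from registering several times. The sample shape, the threshold, and the cooldown are hypothetical values for illustration. They are not the heuristics the project actually shipped.

```typescript
// Hypothetical wrist sample in normalized image coordinates (0..1).
interface WristSample {
  t: number; // timestamp in seconds
  x: number;
  y: number;
}

// Returns an update function that reports true on the frame a punch is detected.
// minSpeed is in normalized units per second; both defaults are assumed values.
function createPunchDetector(minSpeed = 1.5, cooldownSec = 0.4) {
  let prev: WristSample | null = null;
  let lastPunchAt = -Infinity;

  return function update(sample: WristSample): boolean {
    const last = prev;
    prev = sample;
    if (last === null) return false; // need two samples to measure speed

    const dt = sample.t - last.t;
    if (dt <= 0) return false; // ignore out-of-order or duplicate frames

    const speed = Math.hypot(sample.x - last.x, sample.y - last.y) / dt;
    if (speed < minSpeed) return false;
    if (sample.t - lastPunchAt < cooldownSec) return false; // still cooling down

    lastPunchAt = sample.t;
    return true;
  };
}
```

In practice the thresholds would need tuning per camera and frame rate, which is part of what makes consistency hard.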

Accomplishments that we're proud of

We are proud of implementing motion capture without popular hardware tools like LiDAR or depth sensors. Additionally, our world model and interaction design exceeded our expectations given our limited proficiency in 3D modeling and animation.

What we learned

We learned to overcome hardware limitations with out-of-the-box methods, specifically introducing depth when we only had 2D coordinates. We also learned how to embed videos into vector databases.
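One common way to recover depth from 2D coordinates is foreshortening: if a limb's true length is known from calibration, a shorter projected length means the limb is tilted toward or away from the camera. The sketch below shows that idea only. The write-up does not say which heuristic the team used, and the function name and parameters are assumptions.

```typescript
// Estimate how far a limb extends along the camera axis from its 2D projection.
// calibratedLength: limb length measured while the arm is parallel to the image plane.
// projectedLength: the limb's current length in the 2D image, in the same units.
// The result is a magnitude only: 2D data alone cannot tell toward from away.
function estimateDepthOffset(projectedLength: number, calibratedLength: number): number {
  // Tracking noise can make the projection look longer than the limb; clamp it.
  const clamped = Math.min(projectedLength, calibratedLength);
  // Pythagoras: calibrated^2 = projected^2 + depth^2
  return Math.sqrt(calibratedLength ** 2 - clamped ** 2);
}
```

For example, a forearm calibrated at 0.5 units that currently projects to 0.3 units gives a depth offset of roughly 0.4 units.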

What's next for Beach Box

Our plans for Beach Box include integrating depth detection heuristics or incorporating machine learning models that predict depth. We would also like to add more special signature moves and interactions for our avatars, as well as custom maps.

Built With

gemini
mediapipe
nextjs
r3f
supabase
typescript
zustand

Submitted to

Hook 'Em Hacks
- Winner Best Use of Supabase

Created by

MediaPipe rigging, implementing punching heuristics, guard/punching animations, Supabase multiplayer websockets, combat mechanics, live debug tooling

Tuan Nguyen
I worked on the CV rigging of the arms and all the background React Three Fiber models.

Ethan Hoang
Computer vision, punching mechanics, and punching detection

Joseph Lacsamana
I worked primarily on the multimodal system: as users play, it generates clips and uses Google Gemini to embed descriptions and to power a plain-English-to-query engine. I also did a large portion of the UI and many of the quality-of-life features, such as making the software completely hands-free through hand gestures and calibration. I set up the Supabase database, object storage for the clips, and webhooks for the embed runs.

Thomas Petersen