Inspiration
Our team has Aveeno-soft hands, and we'd like to keep it that way. Our 'Throw Hands' Boxing Simulator is a great way to release your inner Rocky Balboa without callusing your plush paws.
What it does
Throw Hands is a machine-learning-enabled boxing game, controlled entirely via a camera feed on your computer. We built a new 3D game engine from the ground up, integrating computer-vision-based controls. The game lets a player 'punch' a virtual 'trainer' without any extra depth sensors or special virtual reality hardware. Each hit on the training pads is worth one point.
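As a rough illustration of the "one hit, one point" rule, here is a minimal Python sketch. It is not the game's actual scoring code: the `PadScorer` class, the spherical pad shape, and the `pad_radius` value are all assumptions made for this example. The idea is to count a point only on the frame where the glove first enters the pad, so a glove resting on the pad does not score repeatedly.

```python
import math


class PadScorer:
    """Hypothetical scorer: one point each time the glove enters the pad."""

    def __init__(self, pad_center, pad_radius=0.5):
        self.pad_center = pad_center  # (x, y, z) of the pad in scene units
        self.pad_radius = pad_radius  # assumed spherical hit zone
        self.score = 0
        self._touching = False  # was the glove inside the pad last frame?

    def update(self, glove_pos):
        """Feed one glove position per frame and return the running score."""
        inside = math.dist(glove_pos, self.pad_center) <= self.pad_radius
        if inside and not self._touching:
            # Count only the transition from outside to inside.
            self.score += 1
        self._touching = inside
        return self.score
```

Calling `update` once per frame with the glove's scene position keeps the score current; the glove has to leave the pad before the next hit can count.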
How we built it
We used a mix of Python, Google's MediaPipe library, the open-source OpenCV computer vision library, and the JavaScript-based 3D rendering framework Babylon.js. Using OpenCV's video stream capture functionality, we color-graded, rotated, and cropped individual frames and converted them into NumPy floating-point arrays. These calibrated frames were then pushed through a customized version of Google's MediaPipe framework, which generated a set of (x, y, z) coordinates. To maintain accuracy and positional consistency, we then performed a series of averaging and smoothing operations on the coordinates. Finally, the coordinates were scaled and mapped in real time to the arbitrary coordinate system of our Babylon.js simulation.
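The smoothing and mapping steps could look something like the Python sketch below. This is an assumption-laden illustration, not our exact pipeline: it uses an exponential moving average as one possible smoothing operation, and the `LandmarkSmoother` class, the `to_scene` function, the `alpha` value, and the scene `scale` are all hypothetical. It also assumes the input x and y are normalized to [0, 1] with the origin at the top-left of the frame, as MediaPipe landmarks typically are.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

Point = Tuple[float, float, float]


@dataclass
class LandmarkSmoother:
    """Exponential moving average over a stream of (x, y, z) points."""

    alpha: float = 0.5  # assumed factor; higher reacts faster, smooths less
    _state: Optional[Point] = field(default=None, init=False)

    def update(self, point: Point) -> Point:
        if self._state is None:
            # The first sample has nothing to blend with.
            self._state = point
        else:
            self._state = tuple(
                self.alpha * new + (1.0 - self.alpha) * old
                for new, old in zip(point, self._state)
            )
        return self._state


def to_scene(point: Point, scale: Point = (4.0, 3.0, 2.0)) -> Point:
    """Map normalized image coordinates to a hypothetical scene space.

    The scene is assumed to be centered on the origin with y pointing up,
    so x is re-centered and y is both re-centered and flipped.
    """
    x, y, z = point
    sx, sy, sz = scale
    return ((x - 0.5) * sx, (0.5 - y) * sy, z * sz)
```

In use, each tracked landmark would get its own smoother, and the result of `to_scene(smoother.update(raw_point))` would be sent to the renderer every frame. A lower `alpha` trades responsiveness for steadier gloves.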
Here's a demonstration of some of the gesture and landmark recognition capabilities of OpenCV + MediaPipe:

Challenges we ran into
Initially, we planned to make Boxing Simulator a multiplayer game, using Firebase to transfer data between players in real time. However, severe latency caused noticeable lag on the boxing gloves, since unmitigated network delays and input lag cascade through the entire system. We responded by quickly rebuilding much of the backend pipeline to keep as much data as possible local, which drastically reduced input lag. Additionally, we found that our Babylon.js-based system was heavily single-threaded, resulting in performance bottlenecks when running the simulation. This was because both the game logic thread and the machine learning inference workloads ran on the CPU. We partially mitigated the problem by moving to a higher-performance test bench with stronger single-core performance. In the future, we hope to explore GPU-based acceleration to reduce processing latency and improve performance.
Accomplishments that we're proud of
- Combined prior frontend experience with computer vision
- Made our first foray into video game development with 'Throw Hands'
- Made a game based on human movement controls without the use of expensive sensors or controllers
- Explored a cutting-edge machine learning framework
Video of Low-FPS Scenario Demo - https://youtu.be/j1mUGi7hscI

What we learned
We need to anticipate network and hardware limitations when designing our projects. Also, we would definitely have operated more efficiently if we had assigned ourselves specific roles before starting.
What's next for Boxing Simulator
Next up: a multiplayer mode that allows players to box each other in the game. Also, customizable gloves!
Box Away!

Built With
- babylon
- computer-vision
- javascript
- machine-learning
- mediapipe
- opencv
- python

