Brainstorming what we could do with ML and TF2, we came up with the idea of using ML to turn any mobile device into a Wii Remote. Using the device's accelerometer and gyroscope sensors, we can detect gestures in real time and link them to in-game actions, taking the experience of mobile+screen games (such as Kahoot) to the next level: instead of having to interact with an app on the device while looking at the main screen, you can focus entirely on the screen and use your device as a remote control, potentially turning any screen into an advanced gaming console with gesture controls.
What it does
We designed a network using TF2 to learn gestures from the phone's accelerometer sensor and then recognise them live. To demonstrate it, we built a fun multiplayer game in which two players log in to a game room and fight each other's dragons using the gestures the network recognises.
How we built it
The game itself is built with Unity plus a Flask server that runs the game mechanics and listens for the players' actions. The network was coded with TF2, and the mobile controller is a web page hosted statically on Firebase, which uses TensorFlow.js to serve the model and recognise gestures in real time.
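As a rough sketch of how the server side fits together (the route, field names, and in-memory room store below are our own illustrative assumptions, not the actual game code), a Flask endpoint receiving a recognised gesture from a controller page might look like:

```python
from flask import Flask, request, jsonify

app = Flask(__name__)
rooms = {}  # room_id -> list of submitted actions (in-memory sketch)

@app.route("/action", methods=["POST"])
def action():
    # The controller page POSTs the gesture recognised on-device.
    data = request.get_json()
    rooms.setdefault(data["room"], []).append(
        {"player": data["player"], "gesture": data["gesture"]})
    return jsonify(ok=True)

# Exercise the endpoint with Flask's built-in test client.
client = app.test_client()
resp = client.post("/action", json={"room": "r1", "player": "p1",
                                    "gesture": "fireball"})
print(resp.get_json())  # {'ok': True}
```

In the real game the server would also validate the room and player and push the action into the Unity game loop; this sketch only shows the request shape.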
Building the model
The original plan was to use both accelerometer and gyroscope data sampled at a relatively high rate (~50Hz) to make the recognition accurate. That model consisted of several 1-dimensional convolutions followed by two LSTM layers. It was very accurate, but unfortunately running it in real time wasn't practical, especially on a mobile device. We settled on a lower sample rate of 10Hz, accelerometer data only, and a simpler model consisting only of 1-dimensional convolutions, which still has good enough accuracy while performing well on mobile, allowing real-time gesture recognition.
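To illustrate, a conv-only model along these lines could be sketched in TF2 as follows (the layer count, filter sizes, and number of gesture classes here are illustrative guesses, not the exact architecture we shipped):

```python
import numpy as np
import tensorflow as tf

N_CLASSES = 4  # hypothetical number of gestures
WINDOW = 20    # 2 s of accelerometer data at 10 Hz
N_AXES = 3     # x, y, z acceleration

# Stack of 1-D convolutions; 'same' padding keeps one score per timestep.
model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(16, 5, padding="same", activation="relu",
                           input_shape=(WINDOW, N_AXES)),
    tf.keras.layers.Conv1D(32, 5, padding="same", activation="relu"),
    # Sigmoid head: an independent confidence score per class per timestep.
    tf.keras.layers.Conv1D(N_CLASSES, 5, padding="same",
                           activation="sigmoid"),
])
model.compile(optimizer="adam", loss="mse")

scores = model.predict(np.zeros((1, WINDOW, N_AXES)), verbose=0)
print(scores.shape)  # (1, 20, 4): a confidence per timestep per class
```

A fully convolutional model like this also converts cleanly to TensorFlow.js, which is what makes running it in the browser on a phone feasible.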
Since the gestures are continuous in time, instead of building a classifier and trying to predict a single moment as the one when the gesture happened, we labelled the dataset with a real-valued confidence score for each class, which rises linearly from 0 about 400ms before the "peak" of the gesture to 1 about 100ms after it.
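This labelling scheme can be sketched as follows (at 10 Hz the ramp spans 4 samples before the peak and 1 after; the drop back to 0 once the ramp ends is our simplifying assumption, not something the writeup specifies):

```python
import numpy as np

def confidence_labels(n_samples, peak_idx, rate_hz=10,
                      rise_before_ms=400, rise_after_ms=100):
    """Confidence ramp for one gesture class: 0 at ~400 ms before the
    gesture's peak, rising linearly to 1 at ~100 ms after it.
    Samples outside the ramp are 0 (the post-ramp falloff is assumed)."""
    before = rise_before_ms * rate_hz // 1000  # 4 samples at 10 Hz
    after = rise_after_ms * rate_hz // 1000    # 1 sample at 10 Hz
    t = np.arange(n_samples)
    ramp = (t - (peak_idx - before)) / (before + after)
    return np.clip(ramp, 0.0, 1.0) * (t <= peak_idx + after)

labels = confidence_labels(12, peak_idx=6)
print(labels)  # [0, 0, 0, 0.2, 0.4, 0.6, 0.8, 1.0, 0, 0, 0, 0]
```

Training against these ramps as a regression target (rather than hard class labels) lets the model's output rise as a gesture unfolds, so a simple threshold on the live score is enough to trigger the in-game action.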
The model's structure can be seen in the project images, along with its performance on the test set, which shows, for each class, the correlation between the actual and predicted scores (top row) and the event timeline with peaks for the actual and predicted confidence scores (bottom row).
What's next for Game of Tensors
We intend to make this game commercial: designing more intricate gestures, adding more abilities, and hosting multiple games at once. We'll finalise the gesture-learning project and make it developer-friendly so everyone can use it to build their own gesture-based games. In addition, we'll create a React Native package so that mobile developers can add basic pre-learned gestures to any app or, combined with the learning framework, invent complex new interactions never seen before on mobile phones.
Try out the gesture model!
We created a simple web page (hosted on Firebase) to showcase the gesture model. It currently works on Android only, and it's far from perfect (it worked on around 70% of the devices we tried). Go on, give it a try at https://tensor-game.firebaseapp.com/gesturedemo.html ! (Use Google Chrome on Android with your volume up.)
We can't wait to see what the community will do with this!