Simon Says is a childhood game in which players act out commands from "Simon". When Simon "says" to do something, you do the action; when a command comes without "Simon says", you should not do it. Our project is a Simon Says deep learning platform where everyone can join the same global game, using AWS DeepLens to verify that each player performs the correct action.

Fun Fact: The Guinness World Record for a game of Simon Says is 12,215 people, set on June 14, 2007 at the Utah Summer Games.

How we built it

We used two open source projects to help us develop our application: OpenPose and MXNet Realtime Pose Estimation. We started by optimizing a pre-trained model to run on Intel's deep learning inference engine. This model generates a pose map that shows the position of the body. We then feed the pose map to a smaller network that classifies the player's action.
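
As a rough illustration of the last two steps, the sketch below reduces a pose map to one peak per joint and then picks the closest stored action. The heatmap layout, the nearest-reference rule, and every name in it (`peak_locations`, `classify_pose`, the example actions) are assumptions made for illustration; the project itself uses a small trained network for the classification step.

```python
import math

def peak_locations(heatmaps):
    """Return the (row, col) of the strongest response in each joint heatmap."""
    peaks = []
    for heatmap in heatmaps:
        best = max(
            ((r, c) for r, row in enumerate(heatmap) for c, _ in enumerate(row)),
            key=lambda rc: heatmap[rc[0]][rc[1]],
        )
        peaks.append(best)
    return peaks

def classify_pose(peaks, reference_poses):
    """Pick the action whose reference joints are closest to the observed peaks."""
    def distance(reference):
        return sum(math.dist(p, q) for p, q in zip(peaks, reference))
    return min(reference_poses, key=lambda action: distance(reference_poses[action]))

# Hypothetical two-joint example: joint 0 = head, joint 1 = right wrist.
heatmaps = [
    [[0.1, 0.9, 0.1], [0.0, 0.2, 0.0], [0.0, 0.0, 0.0]],  # head near the top centre
    [[0.0, 0.0, 0.8], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]],  # wrist raised, top right
]
reference_poses = {
    "raise_right_hand": [(0, 1), (0, 2)],
    "hands_down": [(0, 1), (2, 2)],
}
peaks = peak_locations(heatmaps)
action = classify_pose(peaks, reference_poses)
```

A distance rule like this is only a stand-in: it makes the data flow from pose map to action label visible without requiring a trained model.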

The global game network is built on AWS serverless architecture. A CloudWatch event fires every minute and triggers a series of Lambda functions that generate a game. Each new game is then pushed to every device through the AWS IoT messaging service. The device uses this message to download the necessary audio files, generated with Amazon Polly, and start the game. We believe this architecture will scale to support a large number of devices.
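
The flow from scheduled event to device message might look roughly like the sketch below. It is not the project's actual Lambda code: the action list, the 70% chance of "Simon says", the topic name `simon-says/game`, and all function names are assumptions, and the Polly audio step is left out. The only AWS call used is the `publish` method of the boto3 `iot-data` client.

```python
import json
import random

ACTIONS = ["raise your right hand", "touch your head", "wave"]  # hypothetical action list

def generate_game(rounds=5, rng=random):
    """Build one game: a list of commands, some of them without "Simon says"."""
    return {
        "rounds": [
            {"action": rng.choice(ACTIONS), "simon_says": rng.random() < 0.7}
            for _ in range(rounds)
        ]
    }

def publish_game(game, publish, topic="simon-says/game"):
    """Send the game to subscribed devices through the supplied publish callable."""
    publish(topic=topic, qos=1, payload=json.dumps(game).encode("utf-8"))

def lambda_handler(event, context):
    """Entry point for the scheduled event; boto3 is imported here so the rest runs offline."""
    import boto3
    iot = boto3.client("iot-data")
    game = generate_game()
    publish_game(game, iot.publish)
    return {"rounds": len(game["rounds"])}
```

Passing the publish function in as an argument keeps game generation testable on a laptop, with a fake publisher standing in for AWS IoT.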

Challenges we ran into

We faced several challenges while building our project. The most notable was converting the model to Intel's inference engine. We started by converting an MXNet model with the Python converter but ran into several errors about a missing axis attribute for the concat layer. After resolving these errors we were stuck with "Test failed: This sample accepts networks having only one output", which meant we needed to restructure the final output layer or skip running on the GPU. See "FIX in version 1.2.2" on the AWS Support Forums.

Accomplishments that we are proud of

There were many times when we felt the project was a lost cause. Between the model failing to optimize and the difficulty of interpreting its results, it was a bumpy road to a working demo. The first time we saw the live project feed with the pose map overlaid was the point where we knew we could finish what we started.

What's next for Deeplens Simon Says

In the short term we would like to enable multi-person pose detection so that a group of players can play off the same device. The model is already set up to do this, but we still need to implement it in the game; for the demo it was easier to set up for one player.

In the long term we envision a training/classification loop where, at random times, the system will ask the player to make a new action. We could then take the generated pose and update our classification weights.
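
One simple way to picture such an update, assuming each action is stored as an average pose rather than as network weights, is a running mean that folds each newly captured pose into the stored one. The function name `update_centroid` and the sample coordinates are hypothetical, and this is a simpler stand-in for retraining a network, not the planned implementation.

```python
def update_centroid(centroid, count, new_pose):
    """Fold one new pose into a running mean of joint coordinates."""
    new_count = count + 1
    updated = [
        (cx + (nx - cx) / new_count, cy + (ny - cy) / new_count)
        for (cx, cy), (nx, ny) in zip(centroid, new_pose)
    ]
    return updated, new_count

# Hypothetical stored pose for "wave", averaged over 3 earlier samples.
centroid, count = [(10.0, 20.0), (30.0, 40.0)], 3
centroid, count = update_centroid(centroid, count, [(14.0, 20.0), (30.0, 36.0)])
```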

Thinking about the larger picture, pose estimation and classification have many applications: monitoring children, detecting shoplifting, other types of games such as dancing, and maybe even sports training.

Learn More

We invite you to check out our GitHub repository, deeplens-simon-says, to learn more about the development of our project.


This project is not for commercial use, and you must follow the original license.
