Smart homes are cool, and voice recognition is getting more and more accurate... But what about those who can't use their voice? What about a really noisy environment? Gesture-controlled smart-home assistants might be an underestimated trend.
What it does
The model detects and recognizes users' hand gestures in real time, and the recognized gestures are used to trigger different electronics. Here, we use LED lights to stand in for the various types of electronics in a smart home.
How we built it
We started from an image recognition model (vgg19) that was pre-trained on hand gestures by Brenner Heintz (https://github.com/athena15/project_kojak). We then built a script combining Python OpenCV, the vgg19 model, and an Arduino sketch, which performs the following functions:
- The OpenCV module captures and processes the input video
- The processed gesture signals are fed into the vgg19 network for the classification task
- The outputs from the model are used to trigger the Arduino, which interfaces with the electronics
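The three steps above can be sketched as a small loop. This is only an illustration under assumptions, not the project's actual script: `run_pipeline`, its parameters, and the `min_confidence` threshold are hypothetical names, and the capture, classification, and serial stages are passed in as plain callables so the sketch runs without a camera, a model, or a board.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def run_pipeline(
    frames: Iterable,
    preprocess: Callable,
    classify: Callable[..., Tuple[str, float]],
    send: Callable[[str], None],
    min_confidence: float = 0.8,  # assumed threshold, not taken from the project
) -> List[str]:
    """Capture -> classify -> trigger, sending only confident, changed gestures."""
    sent: List[str] = []
    last_label: Optional[str] = None
    for frame in frames:
        label, confidence = classify(preprocess(frame))
        # Skip uncertain predictions and repeats of the gesture already sent,
        # so the board is not flooded with one message per video frame.
        if confidence < min_confidence or label == last_label:
            continue
        last_label = label
        send(label)
        sent.append(label)
    return sent
```

In a real script, `frames` would come from the OpenCV capture loop, `classify` would wrap the vgg19 prediction, and `send` would write to the Arduino over the serial port.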
Challenges I ran into
The pre-trained model was trained on clean binary photos (only black and white pixels), so it wasn't robust when applied in the real world; background lighting conditions seriously degraded its performance. We had to change how the images were processed and fed into the model to overcome this issue.
Accomplishments that I'm proud of
We are proud of two accomplishments:
- We overcame the issue described above by coming up with a data processing solution. Based on the motion detection algorithms in OpenCV, we developed a reliable script that processes the input video and captures the correct hand gestures. The script was tested under different lighting conditions and showed good performance.
- We integrated the ML model with the Arduino, which makes it possible to interact with the real world. This demonstrates its potential to be deployed at home in the future.
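To illustrate the motion-detection idea behind the first item, here is a minimal NumPy-only sketch of frame differencing: pixels that differ from a stored background frame by more than a threshold become white, everything else black, which gives the kind of binary image the pre-trained model expects. The function name `hand_mask` and the threshold value are assumptions for illustration. In an OpenCV script the same job would more likely be done with `cv2.absdiff` and `cv2.threshold`, or with a background subtractor such as `cv2.createBackgroundSubtractorMOG2`.

```python
import numpy as np


def hand_mask(background: np.ndarray, frame: np.ndarray, threshold: int = 30) -> np.ndarray:
    """Return a binary (0/255) mask of pixels that differ from the background."""
    # Use a signed type so the subtraction cannot wrap around in uint8.
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return np.where(diff > threshold, 255, 0).astype(np.uint8)


# Synthetic grayscale example: a dim, static background and a brighter "hand" patch.
background = np.full((6, 6), 40, dtype=np.uint8)
frame = background.copy()
frame[2:4, 2:4] = 200   # the hand enters the scene
frame = frame + 5       # a small global lighting change, below the threshold

mask = hand_mask(background, frame)
```

Because the mask depends on the difference from the background rather than on absolute brightness, a mild change in room lighting leaves it unchanged, while the hand region still stands out.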
What I learned
In this CodeJam we gained experience in various aspects:
- We learned how to interface Python code with the Arduino using the pySerial module.
- We learned about the architecture of different image recognition models and got a taste of the importance of the training dataset.
- We learned how a gesture recognition system works and how to reduce the effect of background noise.
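As a rough sketch of the Python-to-Arduino link from the first item, the snippet below writes one command byte per recognized gesture. The gesture names, the byte values, the device path, and the baud rate are all assumptions; the actual protocol depends on what the Arduino sketch expects. `serial.Serial` is pySerial's port class, imported lazily here so the dry run works without a board attached.

```python
import io

# Hypothetical one-byte protocol; the real Arduino sketch may expect other values.
GESTURE_TO_BYTE = {"Palm": b"1", "Fist": b"0"}


def send_gesture(port, gesture: str) -> bool:
    """Write the command byte for `gesture` to `port`; return False if unmapped."""
    command = GESTURE_TO_BYTE.get(gesture)
    if command is None:
        return False
    port.write(command)
    return True


def open_arduino(device: str = "/dev/ttyACM0", baudrate: int = 9600):
    """Open the serial link. Requires pySerial and a connected board."""
    import serial  # imported lazily so the dry run below needs no hardware

    return serial.Serial(device, baudrate, timeout=1)


# Dry run with an in-memory buffer standing in for the serial port.
fake_port = io.BytesIO()
send_gesture(fake_port, "Palm")
send_gesture(fake_port, "Peace")  # unmapped, so nothing is written
send_gesture(fake_port, "Fist")
print(fake_port.getvalue())       # b'10'
```

On the board side, the sketch would read one byte at a time from the serial port and switch the matching LED on or off.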
What's next for Eleven
The next step for Eleven has three phases:
- Currently, the distribution of the test data differs from that of the training data, since we are using a model pre-trained on a clean (binary pixel) dataset. We plan to photograph real-world input data and use techniques such as data augmentation to re-train the model. This should greatly improve its performance when deployed in the real world.
- Instead of using the Arduino as a demonstration, we are considering using this model to control actual electronic devices at home.
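The data augmentation mentioned in the first item could look roughly like the sketch below. It is a NumPy-only illustration under assumptions, not a planned implementation: the function name `augment`, the chosen perturbations (flip, brightness scaling, Gaussian noise), and their ranges are all hypothetical.

```python
import numpy as np


def augment(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Return a randomly perturbed copy of a grayscale uint8 image."""
    out = image.astype(np.float32)
    if rng.random() < 0.5:
        # Horizontal flip; only safe for gestures whose meaning is mirror-invariant.
        out = out[:, ::-1]
    out = out * rng.uniform(0.6, 1.4)             # random brightness scaling
    out = out + rng.normal(0.0, 8.0, out.shape)   # sensor-like Gaussian noise
    return np.clip(out, 0, 255).astype(np.uint8)


rng = np.random.default_rng(seed=0)
image = np.zeros((64, 64), dtype=np.uint8)
image[16:48, 24:40] = 255                         # stand-in for a binary hand silhouette
batch = np.stack([augment(image, rng) for _ in range(8)])
```

Each call yields a slightly different version of the same gesture, so a small set of real-world photos can be stretched into a more varied training set.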