The talks held before the hackathon began were about security and hacking, which led us to delve into the world of Face ID (not the most direct offshoot of those topics, but it piqued our curiosity nonetheless). This broadened our horizons: our ambition became to build a centralized vision classification and detection model. As icing on the cake, we also implemented audio interpretation to look for auditory cues in our datasets.
What it does
IcePy requires authentication (user ID and password) to log in to the software with a user-registered account. Once logged in, users can train facial data, predict identity, classify emotions, and even parse words from an audio file.
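Our actual login code is not shown here, but a minimal sketch of a salted-hash credential check of the kind such a login needs (all names below are illustrative, not from the project) could look like this:

```python
import hashlib
import hmac
import os

# Hypothetical in-memory "user database": user ID -> (salt, password hash)
_users = {}

def register(user_id: str, password: str) -> None:
    """Store a salted PBKDF2 hash of the password, never the password itself."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    _users[user_id] = (salt, digest)

def login(user_id: str, password: str) -> bool:
    """Re-derive the hash with the stored salt and compare in constant time."""
    if user_id not in _users:
        return False
    salt, digest = _users[user_id]
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, digest)
```

The per-user random salt means two users with the same password store different hashes, and `hmac.compare_digest` avoids leaking information through comparison timing.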
How we built it
Python was our language of choice for its convenience, easy deployment, abundance of libraries, and the speed at which it lets us turn out code. OpenCV was our ally for image manipulation, while the SpeechRecognition library, along with Google Web Services and Firebase, accelerated development and feature enhancement.
Challenges we ran into
Initially, emotion classification was a hurdle because the dataset the model was trained on consisted predominantly of Caucasian faces; this turned out not to be a big issue, as a little tweaking resolved it. The feature-rich nature of our hack led to a larger-than-expected code base spread across various files. We tamed it by introducing classes wherever we could and fully exploiting modular design; keeping most of the code localized also gave us drastically faster run-times.

Facial identification was another problem, as we had a plethora of options and techniques to choose from. We settled on combining Haar cascades with an LBPH (Local Binary Pattern Histogram) recognizer, a lethal and accurate combination for face identification. The technique is susceptible to rotational inaccuracies, but its strengths in accuracy and speed far outweighed that weakness, and it works even with a small dataset (20 images).

User-interface design with Tkinter turned out to be harder than expected, prompting us to try Kivy, which threw many setup errors. Finally, an unstable internet connection posed a challenge for our IoT implementation, but we eventually overcame it by managing our data in innovative ways.
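To illustrate what the LBPH recognizer computes under the hood, here is a rough pure-Python sketch (our hack used OpenCV's built-in recognizer, not this code): each pixel is compared with its eight neighbours to form an 8-bit binary pattern, the image is summarized as a histogram of those patterns, and two faces are compared by a distance between their histograms.

```python
def lbp_image(img):
    """8-neighbour Local Binary Pattern code for each interior pixel.

    `img` is a 2-D list of grayscale intensities; every neighbour at least
    as bright as the centre contributes one bit to an 8-bit code.
    """
    h, w = len(img), len(img[0])
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]  # clockwise from top-left
    codes = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            centre = img[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                if img[y + dy][x + dx] >= centre:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes

def lbp_histogram(codes):
    """Histogram over the 256 possible LBP codes, normalized to sum to 1."""
    hist = [0] * 256
    total = 0
    for row in codes:
        for c in row:
            hist[c] += 1
            total += 1
    return [n / total for n in hist]

def chi_square(h1, h2):
    """Chi-square distance between histograms; smaller means more similar."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)
```

A real LBPH recognizer additionally splits the face into a grid of cells, concatenates the per-cell histograms, and labels a probe face by its nearest stored histogram, which is why it copes with small training sets; the core computation, though, is as above.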
Accomplishments that we're proud of
We successfully implemented all the aforementioned features and even added a touch of IoT using Firebase, which opens up a world of possibilities for expanding this project. The short time frame of the hackathon certainly amped up our productivity, though one can only wonder what a longer, calmer period would have yielded. Juggling libraries, files, code, and dependencies was challenging, but we feel we did a good job of it.
What we learned
We learnt to work with various new tools, including Tkinter, Firebase, SpeechRecognition, and OpenCV. We learnt to write more modular code for faster turnaround and greater readability. Centralizing features also showed us the power still waiting to be harnessed in the field of computer vision.
What's next for IcePy
IcePy is an ever-growing idea. It aims to be the ideal, always-changing piece of software that encapsulates the great concepts and ideas possible across various fields of computing. It could act as an API for developers, a one-stop centre for everything face and audio detection. It could power tools that help actors practise enacting emotions, for an accurate computer program may well be better than a mirror at aiding budding actors. Kids, too, can be fascinated to see their emotions and sounds being recognized. IcePy is the ONE TOOL that we not only deserve, but also need.