Walking around the RIT campus, we often meet people who are hard of hearing. Communication is vital to everyone, and with TigerAssist we aim to bridge the gap between the deaf and hearing communities.
What it does
It tracks hand gestures live, processes them, and converts them into text.
How we built it
We started by initializing a neural network and writing preprocessing tools. We worked through the math behind the algorithms and functions, determined the corresponding parameter values, and fed them to the network. Image processing extracts hand gestures from two separate video feeds (a webcam and the Leap Motion), and a machine learning algorithm predicts the gesture using a pre-trained model.
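The last step, turning a model's output into text, can be sketched roughly as follows. This is a minimal illustration and not our actual code: it assumes the classifier returns one raw score per gesture, and the label list, the function names, and the 0.6 confidence cutoff are all made up for the example.

```python
import math

# Hypothetical label set; the real gesture vocabulary is not listed here.
GESTURES = ["hello", "yes", "no", "thanks"]

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gesture_to_text(scores, labels=GESTURES, min_confidence=0.6):
    """Return the most likely label, or None if no class is confident enough."""
    if len(scores) != len(labels):
        raise ValueError("expected one score per label")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    # Assumed cutoff: below it we emit nothing rather than a wrong word.
    return labels[best] if probs[best] >= min_confidence else None
```

Returning None for low-confidence frames is one way to avoid printing text when no clear gesture is in view.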
Challenges we ran into
Blue screen of death. The machine learning libraries weren't compatible with the Python version that worked with the Leap API. There were no datasets to train the model on. The camera and the Leap Motion VR could not be connected simultaneously, which produced invalid input and crashed the script. The webcam and the VR device reported different depth values. Backgrounds, hands, and light sources varied between captures.
Accomplishments that we're proud of
GitHub kept our work backed up through the blue screens. We fixed the incompatibility by compiling a set of wrapper classes with SWIG and GCC, which generated a LeapPython extension containing the required libraries. We created our own dataset, despite the time it took. We assigned different ports and video capture port values to the two sources. We used a whiteboard as a contrasting background for accurate input. We applied a Gaussian blur.
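To show why the Gaussian blur and the white background work well together, here is a small pure-Python sketch. It is an illustration under assumptions, not our pipeline: a frame is taken to be a list of rows of grayscale values (0 to 255), the hand is assumed darker than the board, and the kernel size, sigma, threshold, and all function names are hypothetical. In practice an image library would do this far faster.

```python
import math

def gaussian_kernel(size=5, sigma=1.0):
    """Build a normalized 1-D Gaussian kernel of odd length `size`."""
    if size % 2 == 0:
        raise ValueError("kernel size must be odd")
    half = size // 2
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve each row with the kernel, clamping indices at the borders."""
    half = len(kernel) // 2
    out = []
    for row in image:
        last = len(row) - 1
        out.append([
            sum(k * row[min(max(x + j - half, 0), last)]
                for j, k in enumerate(kernel))
            for x in range(len(row))
        ])
    return out

def transpose(image):
    """Swap rows and columns so the row filter can be reused on columns."""
    return [list(col) for col in zip(*image)]

def gaussian_blur(image, size=5, sigma=1.0):
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(size, sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))

def hand_mask(image, threshold=128):
    """Mark blurred pixels darker than `threshold` as hand (1), else board (0)."""
    return [[1 if px < threshold else 0 for px in row]
            for row in gaussian_blur(image)]
```

Blurring before thresholding averages away isolated dark specks on the board, so only larger dark regions such as a hand survive in the mask.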
What we learned
DO NOT RUN A DB SERVER, 2 TENSORFLOW SCRIPTS, AND A TRAINING SCRIPT AT THE SAME TIME (blue screen!!!!!). We learned to create Python33.dll with C++ scripts to make packages compatible with other Python versions. NumPy proved super incompatible with our Python 2.7 setup, while Leap is native to 2.7. (Argghhh)
What's next for TigerAssist
Make the camera wait until a hand is detected. Train with a larger dataset to recognize more hand gestures. Add a text-to-speech feature for virtual communication. Port the Python backend to an Android app for portability. Build a better classifier to distinguish light sources from the white background.
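The first item could start from something like the sketch below. It is only a rough idea under assumptions: frames are lists of rows of grayscale values, a hand shows up as pixels darker than the white background, and the function names, the 5% cutoff, and the frame limit are all hypothetical.

```python
def foreground_fraction(frame, threshold=128):
    """Fraction of pixels darker than `threshold` (assumed hand, not board)."""
    pixels = [px for row in frame for px in row]
    if not pixels:
        return 0.0  # an empty frame contains no hand
    return sum(px < threshold for px in pixels) / len(pixels)

def wait_for_hand(frames, min_fraction=0.05, max_frames=300):
    """Return (index, frame) for the first frame that seems to hold a hand.

    `frames` is any iterable of frames; returns None if nothing qualifies
    within `max_frames`, so the caller never blocks forever.
    """
    for index, frame in enumerate(frames):
        if index >= max_frames:
            break
        if foreground_fraction(frame) >= min_fraction:
            return index, frame
    return None
```

Only frames returned by `wait_for_hand` would then be passed on to the classifier, which skips empty frames entirely.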