Guidance systems for the visually impaired that perform functions such as facial recognition of known people have been developed in the past, but they are constrained to specific environments and situations. The inspiration for this project came from the need for a universal guidance and detection system for people with visual impairments that is not limited by factors such as familiarity with the location.
What it does
This system classifies a variety of objects a person may encounter in daily travel and verbally notifies the user of their presence. The end-user component of the system is a mobile application that runs continuously as the user moves through his or her immediate environment. The app periodically captures images, sends each image to a remote classification engine, and, upon receiving the results, informs the user of the presence or absence of common objects.
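The capture-classify-announce loop described above can be sketched in Java as follows. This is an illustration, not the actual app code: the endpoint URL is hypothetical, and captureImage() and speak() are placeholders standing in for the phone's camera and text-to-speech APIs.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PerceptionLoop {
    private final HttpClient client = HttpClient.newHttpClient();
    private final String endpoint; // hypothetical classification-engine URL

    public PerceptionLoop(String endpoint) { this.endpoint = endpoint; }

    /** POST raw image bytes to the remote engine; returns the label it sends back. */
    public String classify(byte[] image) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(endpoint))
                .header("Content-Type", "application/octet-stream")
                .POST(HttpRequest.BodyPublishers.ofByteArray(image))
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    /** Capture an image every few seconds and announce the classification result. */
    public void run() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                byte[] image = captureImage();  // stand-in for the camera API
                speak(classify(image));         // stand-in for text-to-speech
            } catch (Exception e) {
                e.printStackTrace();
            }
        }, 0, 5, TimeUnit.SECONDS);
    }

    private byte[] captureImage() { return new byte[0]; }                 // placeholder
    private void speak(String label) { System.out.println(label); }       // placeholder
}
```

The capture interval and byte-stream upload format here are assumptions for the sketch; the real app would plug the device camera and speech output into the two placeholders.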
How I built it
The classification component of the system is a convolutional neural network that I designed and trained using the deep learning library DL4J. It was trained on a database of pictures of a variety of commonplace objects, such as laptops and furniture. The complete classification component consists of this neural network merged with a Tomcat web server built to accept remote POST requests containing an image and return the classification result. The front end is a mobile application that continually takes pictures of the person's surroundings while the app is active.
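The server side of this design can be sketched with the JDK's built-in HttpServer standing in for Tomcat, just to show the request/response contract: a POST whose body is raw image bytes, answered with a label. The classify() method here is a stub marking where the trained DL4J network's inference call would go; the route name and response format are assumptions for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;

/**
 * Minimal stand-in for the Tomcat endpoint: accepts a POST whose body is an
 * image and returns a classification label as plain text.
 */
public class ClassificationServer {
    /** Placeholder for the real DL4J inference step on the image bytes. */
    static String classify(byte[] image) {
        return image.length == 0 ? "nothing" : "laptop"; // illustrative only
    }

    /** Start the server; port 0 picks a free ephemeral port. */
    public static HttpServer start(int port) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/classify", exchange -> {
            try (InputStream in = exchange.getRequestBody()) {
                byte[] image = in.readAllBytes();
                byte[] reply = classify(image).getBytes();
                exchange.sendResponseHeaders(200, reply.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(reply);
                }
            }
        });
        server.start();
        return server;
    }
}
```

In the actual system the same handler shape would live in a Tomcat servlet, with the stub replaced by a forward pass through the trained network.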
Challenges I ran into
The two main challenges I ran into were training the neural network in charge of image classification and integrating it with the web server. By finding a good image dataset and training parameters, and by resolving the server integration issues, I was able to overcome these challenges and ultimately build a smoothly running system.
Accomplishments that I'm proud of
I merged the system with a lightweight mobile application while retaining full classification power. The system also proved highly scalable, leaving room for it to grow continually more sophisticated.
What I learned
Through this project I was able to experiment with new networking and web server concepts while improving my end-to-end project development skills. Because the system had to be integrated with a lightweight end-user interface, I also gained experience working in resource-constrained environments.
What's next for Pixel Perception
I plan on hosting the classification engine on a public web server, allowing full functionality of the system regardless of location. Further, I plan to make the classification engine more powerful and able to take user input on what types of images the system should classify, such as less common objects that are regularly encountered by a minority of users.