An app that serves as an assistive technology to help the visually impaired with grocery shopping. We will never fully understand the struggles blind people go through, but we know how they are looked down on and even mocked in some places, including malls and supermarkets.
What it does
We created this project to help them buy food and other products, or simply recognize what kind of product is in front of them. Anyone can use this handy tool and help us make it better.
How we built it
Scraped a large set of images from the web using https://github.com/DivyendraPatil/google-images-download
Trained a CNN (Convolutional Neural Network) on the scraped images using Google's open-source ML framework, TensorFlow.
- Filtered Google search results for high-quality images to improve the efficiency of the resulting model. In our case, we achieved an accuracy of 80.7%.
The model is a CNN (Convolutional Neural Network) based on Google's Inception-v3, trained on a GPU instance. It performs image classification over 40 diverse food items, with about 100 images per category and a 10% hold-out split, using a 22-layer architecture and a maximum of 8,000 training cycles.
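The transfer-learning setup described above can be sketched with `tf.keras`. This is a minimal illustration, not the project's actual training script: in practice one would pass `weights="imagenet"` to get the pretrained Inception-v3 features, while `weights=None` here keeps the sketch runnable offline.

```python
import tensorflow as tf

NUM_CLASSES = 40  # diverse food categories, ~100 images each

def build_food_classifier():
    # Inception-v3 backbone; use weights="imagenet" in practice for
    # pretrained features. weights=None keeps this sketch offline.
    base = tf.keras.applications.InceptionV3(
        weights=None, include_top=False, input_shape=(299, 299, 3))
    base.trainable = False  # retrain only the new classification head
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Freezing the backbone and training only the head is what makes ~100 images per category workable.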
Imported the model files generated by TensorFlow on GCP into an Android application using the TensorFlow SDK. Used the NDK for a C++ detection code base, which computes faster than Java code and, with GPU support, enables lag-free inference.
Frames from the phone camera's 720p video were converted to NumPy arrays of size 299 x 299. A prediction is accepted only when its confidence exceeds a 51% threshold and beats the runner-up category by at least a 10% margin, avoiding confusion between two close categories.
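The preprocessing and decision rule above can be sketched as follows. The function names are illustrative, and the nearest-neighbour resize is a dependency-free stand-in for a proper image-resize routine:

```python
import numpy as np

def preprocess_frame(frame):
    """Center-crop a 720p RGB frame to a square, then sample down to
    299x299 (Inception-v3 input size) and scale pixels to [0, 1]."""
    h, w, _ = frame.shape            # e.g. (720, 1280, 3)
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    crop = frame[top:top + side, left:left + side]
    idx = np.arange(299) * side // 299   # nearest-neighbour sampling
    return crop[np.ix_(idx, idx)].astype(np.float32) / 255.0

def decide(probs, threshold=0.51, margin=0.10):
    """Return the winning class index, or None if the model is unsure.

    A prediction is accepted only when the top probability exceeds the
    51% threshold AND beats the runner-up by at least a 10% margin."""
    order = np.argsort(probs)[::-1]
    top, second = probs[order[0]], probs[order[1]]
    if top >= threshold and (top - second) >= margin:
        return int(order[0])
    return None
```

Rejecting low-margin frames means the app stays silent rather than announcing a wrong item, which matters for a user who cannot visually double-check the result.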
Used a TTS (Text-to-Speech) service to tell the user which food item has been detected; the JNI code base is used extensively for this purpose.
Locale support for different languages is provided by calling Google's Translate API once at initialization, so the machine-learned application does not depend on the network afterward.
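The fetch-once, serve-offline pattern for locale support can be illustrated with a small cache. This is a hypothetical sketch: `fetch_translation` stands in for the real Google Translate API client, which in the app is called only during initialization.

```python
class LocaleCache:
    """Fetch label translations once at startup, then serve them offline.

    `fetch_translation` stands in for a Google Translate API call made
    during initialization; after that, no network access is needed."""

    def __init__(self, labels, target_lang, fetch_translation):
        # One network round-trip per label, at init time only.
        self._table = {label: fetch_translation(label, target_lang)
                       for label in labels}

    def speak_text(self, label):
        # Offline lookup; fall back to the English label if unseen.
        return self._table.get(label, label)

# Hypothetical stub in place of the real Translate API client.
fake_api = {("apple", "es"): "manzana", ("bread", "es"): "pan"}
cache = LocaleCache(["apple", "bread"], "es",
                    lambda text, lang: fake_api[(text, lang)])
```

Because the label set is fixed (40 categories), the whole translation table is tiny and can be fetched in one burst at first launch.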
Links: https://www.tensorflow.org https://cloud.google.com/translate/ https://cloud.google.com/text-to-speech/ https://aws.amazon.com/blogs/machine-learning/how-to-deploy-deep-learning-models-with-aws-lambda-and-tensorflow/
Challenges we ran into
Not being able to sleep doesn't seem like much when your code keeps crashing. And when you know it is because of a corrupt file you are parsing for machine learning, there is not much you can do but shake your head.
Accomplishments that we're proud of
We're proud of not attempting to incorporate any of the sponsor companies' APIs just to be eligible for their prizes, because that would have added unnecessary complexity and distracted us from the main goal.
What we learned
Sometimes it helps to let other people look at your code as a second pair of eyes.
What's next for Food Detect
- Get funding for the project.
- Use AWS to train on many more images quickly.
- Take this fully functional project much further.
This app had been prototyped before, but we could never finish it. This time, we did.