Inspired by how underused machine learning is in modern applications, and by the significant role it could play in aiding the lives of the visually impaired when combined with other technologies.
What it does
The user points the camera in a general direction and taps to take a picture. The app sends it to a server, which uses Tesseract (an OCR library built on machine learning and pre-trained models) in Python to convert the image to text. If no text is found, the app keeps taking pictures. If it receives text from the server, it uses a Text-To-Speech library to make it audible for the user.
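The capture-then-speak loop above can be sketched in Python. This is a hedged illustration of the control flow, not the app's actual code: `capture`, `ocr`, and `speak` are hypothetical stand-ins for the Flutter camera call, the HTTP round-trip to the server, and the Google TTS call.

```python
# Sketch of the app's capture -> OCR -> speak cycle.
# All three callables are stand-ins for the real Flutter/HTTP/TTS pieces.
from typing import Callable


def run_once(capture: Callable[[], bytes],
             ocr: Callable[[bytes], str],
             speak: Callable[[str], None]) -> bool:
    """Take one picture and speak any text the server found.

    Returns True if text was spoken, False if the caller should
    retry (i.e. keep taking pictures, as the app does).
    """
    text = ocr(capture())
    if not text.strip():
        return False  # no text detected: take another picture
    speak(text)
    return True
```

In the real app this runs each time the user taps; the `False` branch corresponds to the app silently taking another picture.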
How we built it
We used Android Studio with the Flutter plugin for the app, and a Flask server running a Python script that invokes the Tesseract OCR library. Google's TTS is used for the Text-To-Speech conversion.
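The server side described above can be sketched as a small Flask app. This is a minimal illustration under assumptions, not the project's actual code: the `/ocr` route name and `image` form field are made up, and the `pytesseract` import is guarded so the sketch still loads on machines without the tesseract binary.

```python
# Hedged sketch of the Flask OCR endpoint (route and field names assumed).
import io

from flask import Flask, jsonify, request

try:
    # pytesseract wraps the tesseract binary; both must be installed.
    import pytesseract
    from PIL import Image
except ImportError:  # keep the sketch importable without the OCR deps
    pytesseract = None

app = Flask(__name__)


def image_to_text(raw_bytes: bytes) -> str:
    """Run Tesseract OCR over raw image bytes; return '' on any failure."""
    if pytesseract is None:
        return ""
    try:
        img = Image.open(io.BytesIO(raw_bytes))
        return pytesseract.image_to_string(img).strip()
    except Exception:  # unreadable upload -> treat as "no text found"
        return ""


@app.route("/ocr", methods=["POST"])
def ocr():
    upload = request.files.get("image")
    if upload is None:
        return jsonify({"text": ""}), 400
    # An empty "text" tells the app to keep taking pictures.
    return jsonify({"text": image_to_text(upload.read())})
```

The app would POST each captured picture to this endpoint and read `text` out of the JSON reply.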
Challenges we ran into
- Development initially began in Python using the Kivy framework. However, due to its lack of iPhone support, poor documentation, and various other missing features, we switched to Android Studio with Flutter, which allowed for greater flexibility and far better documentation.
- Time was a huge factor: we initially planned for the app to connect to the Google Maps API and give the user directions, but that would have taken too long. We also attempted to train the machine learning model ourselves, but training on just 3 fonts took too long.
- We also planned to have the app take photos periodically and speak whenever text was detected. This worked in Python, but Flutter does not support this behaviour.
Accomplishments that we're proud of
The server provides backend support for the app, and it took a lot of time to set up. It can be expanded to provide additional functionality in the future.
What we've learned
Machine learning has a lot more support than we initially thought. It is easy for any individual to set it up with their own training data, and it is extremely versatile; it can be used for a wide variety of purposes.