Inspiration
Coming from the crowded cities of Bangalore and Mumbai, we thought about how incredibly difficult it must be for the visually challenged to navigate the world around them.
This thought grew into an idea for an application that helps blind people interact with the environment around them far more easily, at minimal cost.
What better way than to use a device everyone already carries with them: their phones!
What it does
Lumos helps visually impaired people experience the world around them seamlessly. All a person needs to do is point their phone at their surroundings and tap the screen. Using Computer Vision and Speech APIs, Lumos then describes their environment and warns them of any dangerous obstacles in their path, including traffic, roads, and stairways.
How we built it
Our product is realized using an Android application. While deciding what platform to build on, we went with Android because it is what the majority of people in India (where we're from) use.
Our frontend consists of a lightweight Android application that listens for a tap. Once a picture is taken, the image processing and detection are handled by our backend infrastructure.
Using the Google Cloud Vision API, we detect relevant objects in the image and then assign each our own relevance score. From these, we generate a phrase we predict will be useful to the user and use a Text-to-Speech API to relay this information back to the user as a short audio clip.
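As a rough sketch, the re-ranking and phrasing step might look like the following (the `RelevanceRanker` class, its hazard labels, boost weights, and phrase templates are illustrative placeholders, not our exact production logic):

```java
import java.util.Map;

// Illustrative sketch: re-rank Vision API labels, then phrase the winner
// for text-to-speech. Weights and wording are placeholder values.
public class RelevanceRanker {
    // Labels we treat as potential hazards get a boost (hypothetical weights).
    private static final Map<String, Double> HAZARD_BOOST = Map.of(
            "car", 2.0, "traffic", 2.0, "road", 1.5, "stairs", 2.0, "bicycle", 1.5);

    // Combine the API's confidence with our own hazard weighting.
    public static double relevance(String label, double confidence) {
        return confidence * HAZARD_BOOST.getOrDefault(label.toLowerCase(), 1.0);
    }

    // Pick the highest-relevance label and turn it into a spoken phrase.
    public static String describe(Map<String, Double> labelConfidences) {
        String best = null;
        double bestScore = 0.0;
        for (Map.Entry<String, Double> e : labelConfidences.entrySet()) {
            double score = relevance(e.getKey(), e.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        if (best == null) {
            return "Nothing recognizable ahead.";
        }
        boolean hazard = HAZARD_BOOST.containsKey(best.toLowerCase());
        return hazard ? "Caution: " + best + " ahead."
                      : "There is a " + best + " in front of you.";
    }

    public static void main(String[] args) {
        // The hazard boost lets "car" (0.6) outrank "tree" (0.9).
        System.out.println(describe(Map.of("tree", 0.9, "car", 0.6)));
    }
}
```

The point of the boost table is that a lower-confidence hazard label can still outrank a high-confidence but harmless one, which is what the warning behaviour needs.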
Challenges we ran into
The major challenge was encapsulating external features nicely within the Android application. As first-time hackers, we completely underestimated how difficult this would be, and it took up a majority of our time.
In addition to this, extracting the correct information from the Vision API was also a challenge, but one we faced up to quite bravely.
Accomplishments that we're proud of
When we started the project, we decided to keep two broad concepts in mind: efficiency and simplicity.
We're proud to say this is reflected in our entire application. Our application runs with minimal latency and has an extremely simple and intuitive user experience.
We accomplished the first by compressing captured images before upload, shrinking the payload and thus the time for API calls. We also store nothing on the user's phone.
We accomplished the second by sticking to our minimal design principles: a single screen and no complex sequence of commands, making for a lightweight, user-friendly application.
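To illustrate the compression idea, here is a minimal desktop-Java sketch using the standard `javax.imageio` API (on the device itself Android's `Bitmap.compress` would be used instead; the quality values here are arbitrary examples):

```java
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.MemoryCacheImageOutputStream;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class JpegCompressor {
    // Encode an image as JPEG at the given quality (0.0f = smallest, 1.0f = best).
    public static byte[] compress(BufferedImage image, float quality) {
        try {
            ImageWriter writer = ImageIO.getImageWritersByFormatName("jpeg").next();
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionQuality(quality);

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            writer.setOutput(new MemoryCacheImageOutputStream(out));
            writer.write(null, new IIOImage(image, null, null), param);
            writer.dispose();
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(640, 480, BufferedImage.TYPE_INT_RGB);
        byte[] low = compress(img, 0.3f);
        byte[] high = compress(img, 0.9f);
        System.out.println(low.length + " bytes vs " + high.length + " bytes");
    }
}
```

Lowering the quality factor trades visual fidelity for a smaller upload, which is what keeps the round trip to the backend short on a mobile connection.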
What we learned
We learned that building an application that uses external APIs and must work in real time is an incredibly challenging task.
We also learned how to collaborate while working on an application of this scale, maintaining good communication among team members (as well as good coding practices!). Two of our teammates also learned Android development from scratch.
What's next for Lumos
We believe that Lumos has a lot of potential for growth and that is why it was so important for us to build a basic working prototype.
We plan to incorporate a more sophisticated backend infrastructure for detecting objects in images, as well as rendering and organizing this information more effectively. We are also thinking about better ways to pre-process the data, keeping latency low while preserving as much useful information as possible.
Another big addition, one we felt was beyond the scope of this hackathon, is making Lumos more personalized to the user, for example by detecting the faces of friends and family the user meets.
Built With
- android
- android-studio
- google-cloud
- google-vision
- java
- text-to-speech