Project Occuli

[Images: architecture of our application; how the visually impaired person would wear the phone; recognition mode; detection mode]

Inspiration

Identity theft is a problem that affects almost everybody, but visually impaired people are affected by it more than most. Our Android application aims to address this problem by using face recognition and object detection.

What it does

This is the general flow of our app: the visually impaired person wears the phone like an ID card, with the screen facing outward. Earphones are connected to the phone to deliver the app's audio messages.

Detection mode:

By default, the app is in detection mode. If a known person approaches the visually impaired person, the camera picks up that person's face and the app plays the person's name as an audio message.
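The matching step in detection mode can be illustrated with a small sketch. The app itself runs on-device in Java, so this Python version is only an illustration of the idea, not the project's code. It assumes each face is reduced to a numeric embedding vector and that a face counts as "known" when its cosine similarity to a stored embedding reaches a threshold; the function names, the 0.8 threshold, and the sample names and vectors are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector cannot match anything
    return dot / (norm_a * norm_b)


def identify(embedding, known_faces, threshold=0.8):
    """Return the name of the most similar known face, or None.

    known_faces maps a name to a stored embedding. The threshold is an
    assumed value; a real system would tune it on its own data.
    """
    best_name, best_score = None, threshold
    for name, known_embedding in known_faces.items():
        score = cosine_similarity(embedding, known_embedding)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical 3-dimensional embeddings; real ones are much longer.
known_faces = {"Asha": [0.9, 0.1, 0.0], "Ravi": [0.0, 0.2, 0.95]}

print(identify([0.88, 0.15, 0.02], known_faces))  # close to Asha's vector
print(identify([0.5, 0.5, 0.5], known_faces))     # not close to anyone
```

When `identify` returns a name, the app would speak it; when it returns None, the face is treated as unknown.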

Recognition Mode:

If an unknown person approaches, the visually impaired person presses the volume-down button to switch to recognition mode and hands the phone to the unknown person. That person states their name, and the phone's camera then scans their face and adds it to the list of known faces.
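Enrolling a new face can be sketched as follows. This is a Python illustration under assumptions, not the app's Java code: it assumes the spoken name arrives as text from speech recognition, that the camera yields several embedding vectors for the same face, and that averaging them gives a reasonable stored template. The function names `mean_embedding` and `enroll` are hypothetical.

```python
def mean_embedding(embeddings):
    """Average several embeddings of the same face into one vector."""
    if not embeddings:
        raise ValueError("need at least one embedding to enroll a face")
    length = len(embeddings[0])
    if any(len(e) != length for e in embeddings):
        raise ValueError("all embeddings must have the same length")
    # zip(*embeddings) groups the values of each dimension together.
    return [sum(values) / len(embeddings) for values in zip(*embeddings)]


def enroll(known_faces, spoken_name, embeddings):
    """Store a new face under the spoken name and return the stored name.

    known_faces is a dict mapping names to embeddings and is updated
    in place. The name is normalized (trimmed, title-cased), which is
    an assumed convention.
    """
    name = spoken_name.strip().title()
    if not name:
        raise ValueError("no name was recognized")
    known_faces[name] = mean_embedding(embeddings)
    return name


known_faces = {}
stored_name = enroll(known_faces, "  maria ", [[1.0, 0.0], [0.0, 1.0]])
print(stored_name, known_faces[stored_name])
```

After enrollment, the same dictionary can be consulted in detection mode the next time this person approaches.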

Sentry mode:

Another important feature of the app is its built-in object detection. When the visually impaired person presses the volume-up button, the app switches to sentry mode. The person can then point the phone in one direction and take a photo, and the app plays an audio message describing the objects in that direction (e.g., "There are 2 persons in that direction").
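Turning a list of detected labels into the spoken sentence could look like the sketch below. It is a minimal Python illustration, assuming the detector yields one label string per detected object; the function name `describe` and the naive pluralization (appending "s") are assumptions, and the wording simply follows the example sentence above.

```python
from collections import Counter


def describe(labels):
    """Build a sentence such as 'There are 2 persons in that direction.'"""
    if not labels:
        return "No objects were detected in that direction."

    # Counter keeps labels in the order they were first seen.
    counts = Counter(labels)
    parts = []
    for label, count in counts.items():
        # Naive plural: just append "s" (wrong for words like "bus").
        noun = label if count == 1 else label + "s"
        parts.append(f"{count} {noun}")

    if len(parts) == 1:
        listing = parts[0]
    else:
        listing = ", ".join(parts[:-1]) + " and " + parts[-1]

    verb = "is" if len(labels) == 1 else "are"
    return f"There {verb} {listing} in that direction."


print(describe(["person", "person"]))
print(describe(["person", "chair", "person"]))
print(describe([]))
```

The resulting string is what a text-to-speech step would then convert to audio.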

How we built it

Sentry mode uses a single-shot object detection model, built with PyTorch and trained on the COCO dataset, to detect objects. gTTS is used for text-to-speech, and both are deployed as an API on Google Compute Engine.
Detection mode and recognition mode were built with Java, C++, JavaCV, TensorFlow, OkHttp, and the Camera2 API.
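Between the detection model and the text-to-speech step, the raw detections usually need to be filtered and mapped to readable names. The sketch below shows one way this might be done in Python. It is an assumption, not a description of the deployed API: the `labels`/`scores` dictionary layout, the 0.5 score cutoff, the function name `confident_labels`, and the three-entry ID-to-name table are all illustrative.

```python
def confident_labels(detection, class_names, min_score=0.5):
    """Return class names for detections scoring at least min_score.

    detection is assumed to hold parallel lists under "labels"
    (integer class IDs) and "scores" (confidences between 0 and 1).
    """
    names = []
    for label_id, score in zip(detection["labels"], detection["scores"]):
        if score >= min_score:
            # Fall back to a generic name for IDs missing from the table.
            names.append(class_names.get(label_id, "unknown object"))
    return names


# Illustrative subset of an ID-to-name table, not the full COCO list.
class_names = {1: "person", 18: "dog", 62: "chair"}

# Hypothetical model output for one photo.
detection = {
    "labels": [1, 1, 62, 18],
    "scores": [0.92, 0.81, 0.55, 0.20],
}

print(confident_labels(detection, class_names))
```

Here the low-scoring "dog" detection is dropped, so only the two persons and the chair would be announced.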

Challenges we ran into

Integrating a face recognition system that runs natively in our app.
Merging all the features into one app.
Combining a speech-to-text API and hotword detection in one activity.

Accomplishments that we're proud of

Setting up face detection and recognition that work on the phone itself, without the need for an external API.
Deploying a Flask API to Google Cloud that runs single-shot detection on an image and returns an audio message stating which objects are present in that image.

What we learned

Deploying PyTorch models to the cloud.
Implementing face detection and recognition that run locally in the Android app.
Implementing speech recognition in an app.

What's next for Project Occuli

Conduct a survey among visually impaired people to understand their difficulties, so that we can fine-tune our object detection model to their needs.
Improve the scalability of the app.
Find a way to attach more physical buttons to the phone to make the user experience smooth and seamless.
Add an audio-based virtual assistant that can communicate with the visually impaired person and make the UX even better.
Add a character recognition system to the object detection model so that any text found in sentry mode is conveyed as audio to the visually impaired person (e.g., text on billboards and signs).

Built With

android-studio
c++
camera2api
flask
google-cloud
gtts
java
javacv
okhttp
python
pytorch
tensorflow

Submitted to

Global PyTorch Summer Hackathon

Created by

I helped build the Android application, and we successfully added facial recognition that runs locally on mobile phones.

Bharath Nair
I worked on the Android app.

Xiaowei Wang
Yes, I know, I don't look like fred but i am fred
I helped build the web API used to run inference on images to detect objects.

steve paul

Updates

Bharath Nair started this project — Sep 15, 2019 03:32 PM EDT
