Hackathon APIs Used:
- Microsoft: Cognitive Services - Vision, Speech, and Language + Azure Bot Service
- Facebook: Graph API, Live Video API
Tommy Edison, a visually impaired YouTuber, described in a video a question he had for a sighted person: “What is it like to walk into a room and instantly know what and where everything is?”
In an age where computer vision and machine learning can identify the contents of a room with confidence, it is essential that we use technology to aid the visually impaired. InVision aims to improve the daily quality of life of the visually impaired through navigation features, contextual awareness (people and objects), and a way of connecting to people for visual assistance. The integrated voice assistant provides an intuitive back-and-forth interface for switching between features.
What it does
InVision is a proof of concept vision aid headset for the visually impaired.
It is a toolset for assisting the blind, including navigation, contextual awareness, a way of connecting to people for visual assistance, and a personal voice assistant. Our collection of hardware, including several cameras, vibration motors, and sensors, is what enables these apps. The hardware's use cases extend beyond just the demo apps built during the hackathon.
InVision has 3 main tools, and a smart assistant to guide the whole process.
Navigation
Navigating, especially in new places, is a difficult challenge for the visually impaired. InVision provides a fully guided path using ARKit and relays directions to the wearer through vibration motors. An iPhone supplies the user's position, rotation, and target direction for navigation. This setup also provides the collision detection needed to walk at a reasonable pace.
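The steering logic can be sketched roughly as follows. This is an illustrative mapping from heading error to a haptic cue, not the exact hackathon code; the tolerance value and function names are assumptions:

```python
# Illustrative sketch: map the angle between the wearer's current heading
# and the target direction onto a left/right/straight haptic cue, which
# would then drive the corresponding vibration motor.

def steering_cue(heading_deg, target_deg, tolerance=15.0):
    """Return 'left', 'right', or 'straight' given headings in degrees."""
    # Signed smallest angle from heading to target, in (-180, 180]
    error = (target_deg - heading_deg + 180.0) % 360.0 - 180.0
    if abs(error) <= tolerance:
        return "straight"
    return "right" if error > 0 else "left"
```

In a loop, the cue would pulse the matching motor until the wearer's heading (reported by ARKit on the phone) falls within the tolerance band.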
Connect to Others
I’m trying to use a vending machine, but I can’t tell what to do, oh no! Ask our virtual assistant, and it will connect you to someone who can watch your video stream and help you complete the task.
Contextual Awareness
When walking into a new room, vision is one of the most important signals for contextual awareness.
InVision responds to simple requests like “what is in front of me?” or “who am I facing?” in an intuitive manner. Using cloud object recognition, InVision identifies the objects in the wearer's field of view and speaks them out in a friendly voice. If the wearer happens to be facing a Facebook friend, InVision will identify that person for the wearer.
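The last step, turning recognition results into speech, can be sketched like this. The response shape (a list of tag/confidence pairs, as cloud vision APIs typically return) and the confidence cutoff are assumptions for illustration:

```python
# Illustrative sketch: turn a list of recognized objects from a cloud
# vision API into a single sentence to hand off to text-to-speech.

def describe_scene(tags, min_confidence=0.6):
    """tags: list of {'name': str, 'confidence': float} dicts."""
    names = [t["name"] for t in tags if t["confidence"] >= min_confidence]
    if not names:
        return "I don't see anything I recognize."
    if len(names) == 1:
        return f"I see a {names[0]}."
    return "I see " + ", ".join(names[:-1]) + f", and {names[-1]}."
```

Filtering on confidence keeps the assistant from narrating low-certainty guesses, which matters when the wearer is relying on the description.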
A virtual assistant is what binds together all the apps, and what allows for a seamless interaction via voice commands.
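The binding role of the assistant is essentially intent dispatch. A minimal keyword-based sketch of that idea is below; the real assistant leans on cloud language services, and the phrase lists here are made up for illustration:

```python
# Illustrative sketch: route a transcribed voice command to one of the
# three tools. Phrase lists are placeholders, not the production grammar.

INTENTS = {
    "navigation": ("take me", "navigate", "walk to"),
    "describe":   ("what is in front", "who am i facing", "describe"),
    "assist":     ("help me", "connect me", "call"),
}

def route(utterance):
    """Return the intent name for an utterance, or 'unknown'."""
    text = utterance.lower()
    for intent, phrases in INTENTS.items():
        if any(p in text for p in phrases):
            return intent
    return "unknown"
```

Whatever tool is chosen then takes over the audio channel until it finishes, giving the back-and-forth flow described above.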
How we built it
InVision was constructed using the frame of a pair of safety glasses, a USB webcam, vibration motors, LEDs, and a Raspberry Pi. A chest harness connects a phone used for navigation.
Challenges we ran into
Building InVision was no easy task. It was an ambitious idea comprising many different components, both software and hardware. For the headset to function, each member of our group had to individually succeed in their respective modules.
Hardware was tricky. At one point we thought we had a misunderstanding of transistors, but it turned out that the transistors we were using were just broken.
Accomplishments that we're proud of
- The headset looks pretty cool
- We were able to divide tasks in an efficient manner
- We accomplished everything we set out to do
What we learned
- Temperature sensors are interchangeable with PNP transistors
- Working with Microsoft Azure Cognitive Services
What's next for InVision
- Use a Pi Zero and a 3D printed case for a sleeker design
- Expand on the current apps; find a better method of collecting data for navigation
- Build an incentive system for people to sign up as visual assistants (Mechanical Turk?)