Hackathon APIs Used:
- Microsoft: Cognitive Services - Vision, Speech, and Language + Azure Bot Service
- Facebook: Graph API, Live Video API
Tommy Edison, a visually impaired YouTuber, described in a video a question he had for a sighted person: “What is it like to walk into a room and instantly know what and where everything is?”
In an age where computer vision and machine learning can identify the contents of a room with confidence, it is essential that we use technology to aid the visually impaired. InVision aims to improve the daily quality of life of the visually impaired through navigation features, contextual awareness (people and objects), and a way of connecting to people for visual assistance. The integrated voice assistant provides an intuitive back-and-forth interface for switching between features.
What it does
InVision is a proof of concept vision aid headset for the visually impaired.
It is a toolset for assisting the blind, including navigation, contextual awareness, a way of connecting to people for visual assistance, and a personal voice assistant. Our collection of hardware, including several cameras, vibration motors, and sensors, is what enables these apps. The hardware's use cases extend beyond just the demo apps built during the hackathon.
InVision has 3 main tools, and a smart assistant to guide the whole process.
Navigation
Navigating, especially in new places, is a difficult challenge for the visually impaired. InVision provides a fully guided path using ARKit and relays directions to the wearer through vibration motors. An iPhone supplies the user's position, rotation, and target direction for navigation. This setup also provides the collision detection needed to walk at a reasonable pace.
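The steering logic can be sketched roughly as follows. This is an illustrative mapping from heading error to a haptic cue, not the exact hackathon code; the tolerance value and function names are assumptions:

```python
# Illustrative sketch: map the angle between the wearer's current heading
# and the target direction onto a left/right/straight haptic cue, which
# would then drive the corresponding vibration motor.

def steering_cue(heading_deg, target_deg, tolerance=15.0):
    """Return 'left', 'right', or 'straight' given headings in degrees."""
    # Signed smallest angle from heading to target, in (-180, 180]
    error = (target_deg - heading_deg + 180.0) % 360.0 - 180.0
    if abs(error) <= tolerance:
        return "straight"
    return "right" if error > 0 else "left"
```

In a loop, the cue would pulse the matching motor until the wearer's heading (reported by ARKit on the phone) falls within the tolerance band.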
Connect to Others
I’m trying to use a vending machine, but I can’t tell what to do, oh no! Ask our virtual assistant, and it will connect you to someone who can watch your video stream and help you complete the task.
Contextual Awareness
When walking into a new room, vision is one of the most important signals for contextual awareness.
InVision responds to simple requests like “what is in front of me?” or “who am I facing?” in an intuitive manner. Using cloud object recognition, InVision identifies the objects in the wearer's field of view and speaks them out in a friendly voice. If the wearer happens to be facing a Facebook friend, InVision will identify that person for the wearer.
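The last step, turning recognition results into speech, can be sketched like this. The response shape (a list of tag/confidence pairs, as cloud vision APIs typically return) and the confidence cutoff are assumptions for illustration:

```python
# Illustrative sketch: turn a list of recognized objects from a cloud
# vision API into a single sentence to hand off to text-to-speech.

def describe_scene(tags, min_confidence=0.6):
    """tags: list of {'name': str, 'confidence': float} dicts."""
    names = [t["name"] for t in tags if t["confidence"] >= min_confidence]
    if not names:
        return "I don't see anything I recognize."
    if len(names) == 1:
        return f"I see a {names[0]}."
    return "I see " + ", ".join(names[:-1]) + f", and {names[-1]}."
```

Filtering on confidence keeps the assistant from narrating low-certainty guesses, which matters when the wearer is relying on the description.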
A virtual assistant is what binds together all the apps, and what allows for a seamless interaction via voice commands.
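The binding role of the assistant is essentially intent dispatch. A minimal keyword-based sketch of that idea is below; the real assistant leans on cloud language services, and the phrase lists here are made up for illustration:

```python
# Illustrative sketch: route a transcribed voice command to one of the
# three tools. Phrase lists are placeholders, not the production grammar.

INTENTS = {
    "navigation": ("take me", "navigate", "walk to"),
    "describe":   ("what is in front", "who am i facing", "describe"),
    "assist":     ("help me", "connect me", "call"),
}

def route(utterance):
    """Return the intent name for an utterance, or 'unknown'."""
    text = utterance.lower()
    for intent, phrases in INTENTS.items():
        if any(p in text for p in phrases):
            return intent
    return "unknown"
```

Whatever tool is chosen then takes over the audio channel until it finishes, giving the back-and-forth flow described above.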
How we built it
InVision was constructed using the frame of a pair of safety glasses, a USB webcam, vibration motors, LEDs, and a Raspberry Pi. A chest harness connects a phone used for navigation.
Challenges we ran into
Building InVision was no easy task. It was an ambitious idea comprising many different components, both software and hardware. For the headset to function, each member of our group had to individually succeed in their respective modules.
Hardware was tricky. At one point we thought we had a misunderstanding of transistors, but it turned out that the transistors we were using were just broken.
Accomplishments that we're proud of
- The headset looks pretty cool
- We were able to divide tasks in an efficient manner
- We accomplished everything we set out to do
What we learned
- Temperature sensors are interchangeable with PNP transistors
- Working with Microsoft Azure Cognitive Services
What's next for InVision
- Use a Pi Zero and a 3D printed case for a sleeker design
- Expand on the current apps; find a better method of collecting data for navigation
- Build an incentive system for people to sign up as visual assistants (Mechanical Turk?)