The Haptic Vision Team!
The Entire Training and Detection Setup
Schematic for the Vibration Motor Circuit
Brandon plugged into the Matrix
Haptic Vision provides a low-effort, high-impact way to learn to see and communicate unobtrusively, especially for the visually impaired.
Passive Haptic Learning (PHL) allows people to acquire muscle memory through vibrations while paying little to no attention to the stimuli. With cheap, ubiquitous electronic components like microcontrollers and vibration motors, anyone can build a system to passively acquire tactile skills such as playing an instrument, reading Braille, or using Morse code. An offshoot of PHL is Passive Haptic Rehabilitation (PHR), which has allowed patients with spinal cord injuries to more than double their recovery rate with the same equipment.
Sensory substitution is the conversion of one type of sensory stimulus into another. For example, visual information can be conveyed and perceived through our other senses, such as hearing or touch. In this project, we substituted the sense of touch for the sense of sight, allowing people to "see" and "listen" with their skin through a haptic encoding delivered by a vibration motor.
We first learned about Passive Haptic Learning when Sarthak started working with Wearable Computing research at Georgia Tech. After conducting research and running experiments on its effectiveness, we were all amazed by how quickly the human brain was able to learn the encoding pattern which transformed the visual input through sensory substitution into the replacement vibrations. We realized that the potential number of meaningful applications was tremendous.
Because Sarthak and Ralph were already familiar with the pace of hackathons, we drew on their experience and decided that the given time window was a good opportunity to test the practicality of our project.
What It Does
Haptic Vision consists of three stages – a training stage, an object detection stage, and the blind communication phase.
Training Stage
In this stage, we teach the brain to associate the seemingly random vibration patterns felt on the forehead with the correct letters. We first play the sound of the letter we want the trainee to learn, then follow it with the vibration pattern corresponding to that letter.
Object Detection Stage
In this stage, we use a deep convolutional neural network, VGG-16 (16 weight layers: 13 convolutional and 3 fully connected), to process images captured from our laptop and understand their semantic contents. The main job of the neural network is to classify images into a wide range of existing categories; we used the same categories as VGG-16. The network is able to understand what the objects are by successively extracting higher-level features from the pixels in an image. The first layers extract information like edges and shapes, while the last layers contain much more abstract information about object positioning.
We set up an experiment to simulate the conditions in which people with visual impairments live. We believe that this technology could be used in the future to let those people perceive objects through a haptic sensory extension. To test this, we place objects in a box, completely out of our subject's sight. A camera then uses AI to detect which objects are in the box and passes a signal to the haptic module. This enables the blind person to "see" and understand what those objects are through the haptic module, effectively restoring their capacity to perceive objects.
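The last step of this pipeline, turning the network's raw class scores into a single label for the haptic module, might look like the sketch below. It is not the team's code: `softmax`, `top_label`, the example labels, and the `min_confidence` cutoff are all assumptions made for illustration, and the scores stand in for the output of the classifier's final layer.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(scores, labels, min_confidence=0.5):
    """Return the most likely label, or None if the network is unsure."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < min_confidence:
        return None  # assumed policy: stay silent rather than vibrate a guess
    return labels[best]


# Hypothetical scores for three hypothetical categories.
label = top_label([0.1, 4.0, 0.3], ["cup", "banana", "keyboard"])
```

Returning `None` below the cutoff is one possible design choice: for a user who cannot check the result visually, a wrong word spelled out on the forehead may be worse than no word at all.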
Blind Communication Phase
How We Built It
The hardware circuit consists of an Arduino Uno controller, a breadboard, and a vibration motor. The circuit was then securely attached to a hat, making it easy for our test subject to wear.
Software to Communicate with Arduino
The Arduino itself runs code written in the Arduino language, which belongs to the C family of languages. We set up a Python virtual environment to talk to the Arduino over a serial connection. Python acts as the glue in our software architecture, taking information from the neural network and feeding it to the Arduino. We used the pySerial library, from its GitHub repository, to handle this cross-device communication.
VGG16: a convolutional neural network with 16 weight layers, whose architecture was originally designed by the VGG team (the Visual Geometry Group at the University of Oxford).
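As a rough illustration of the Python side of this link, the sketch below sends one line of text over a serial port. The device path, the 9600 baud rate, and the newline-terminated uppercase message format are assumptions, not details from the project, and `open_port` and `send_text` are hypothetical helper names. The only pySerial calls used are `serial.Serial(...)`, `write`, and `flush`.

```python
def open_port(device="/dev/ttyACM0", baud_rate=9600):
    """Open the Arduino's serial port (device path and baud rate are assumed)."""
    import serial  # pySerial; imported here so send_text runs without it
    return serial.Serial(device, baud_rate, timeout=1)


def send_text(port, text):
    """Send text to the Arduino as one newline-terminated ASCII line."""
    # Assumed wire format: uppercase letters followed by "\n".
    payload = (text.strip().upper() + "\n").encode("ascii", errors="ignore")
    port.write(payload)
    port.flush()
    return len(payload)


# Usage with real hardware attached (not run here):
#     port = open_port()
#     send_text(port, "cup")
#     port.close()
```

Because `send_text` only needs an object with `write` and `flush`, it can be exercised with an in-memory buffer such as `io.BytesIO` before the board is plugged in. The Arduino sketch on the other end would have to read the line and drive the motor accordingly.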
Challenges We Ran Into
We had difficulty building the neural network to decipher handwriting and to recognize objects that weren't in the ImageNet dataset.
There were several Arduino components that we could not use in this project due to lack of access.
We weren't able to get access to the Predix platform in time to deploy our Neural Network model to the cloud. That was unfortunate, since we believe that this could have had a great impact in the future. The AI software could have been used by multiple IoT devices and the platform could be used to constantly improve the quality of our service.
Accomplishments We Are Proud Of
We actually managed to train someone in a realistic setting! It took only a few hours for our volunteer to be able to decode the sensory input from the vibration motor into a meaningful substitute for vision. We also enabled people to write messages directly to Brandon from their computers. This means that, in the future, this technology could be used as a hands-free form of messaging. The technology was also promising at object recognition, enabling Brandon to effectively "see" which objects were in the box through the device. This means that a future iteration of this project could help blind users navigate their environments effectively.
What's Next for Haptic Vision
We plan to do further research on how much better this hack can get with multiple tactile motors. Brandon was able to easily read sentences and identify objects with only one motor encoding the information, so we wonder how much better we could do with more motors arranged in a grid on a vest or belt. We want to know whether this would let us deliver messages to our users faster, or even pass along more complicated information. One cool application of a better haptic module would be teaching people to sing or play musical instruments better.