Foreword

Before we begin, a very big thank you to the NVIDIA Jetson team for their generosity in making this project submission possible.

Inspiration

Nearly all sight-assistance devices for the blind fall into just two categories: voice assistants for navigation, and haptic feedback devices for directional movement. Although the intent behind these devices is noble, they fall short of delivering an effective sight solution for the blind.

Voice assistant devices that relay visual information from a camera-equipped computer cannot deliver that information in real time, which severely limits their usefulness. Additionally, the blind depend heavily on their hearing to navigate their environments; they push their remaining senses to the limit to compensate for the lack of sight, and a voice assistant occupies and adds noise to this critical sensory channel.

Haptic feedback devices are even less effective; they simply tell the user to move left, right, backward, and so on. While these devices provide real-time feedback and don't add noise to the user's hearing the way voice assistants do, they convey no information about what is actually in front of the user, only how to move. This adds little value for a blind user.

It's 2021. Voice assistants and haptic directional devices are a thing of the past. Having blind relatives and friends, we wanted to create a project that leverages the latest advances in technology into a truly transformative solution. After about a week of work, we've developed OptiLink: a brain-machine interface that feeds AI-processed visual information directly to the user's brain in real time, eliminating the need for ineffective voice assistants and directional movement aids for the blind.

What it does

OptiLink is the next generation of solutions for the blind. Instead of using a voice assistant to tell the user what's in front of them, it sends real-time, AI-processed visual information directly to the user's brain in a form they can make sense of. So if our object detection neural network detects a person, the blind user can actually tell that a person is in front of them through our brain-machine interface. The user can also gauge the distance to environmental obstacles through echolocation, once again fed directly to their brain.

Object detection is done on a camera-equipped NVIDIA Jetson Nano, a low-power single-board computer optimized for deep learning. A Bluetooth-enabled nRF52 microcontroller connected to an ultrasonic sensor measures distances for echolocation. These modules are conveniently packed into a hat for use by the blind.

On the Nano, a MobileNet neural network accelerated with the NVIDIA JetPack SDK detects objects (people, cars, etc.) and sends a corresponding output over Bluetooth, via the Bleak library, to two Neosensory Buzz sensory substitution devices, one on each arm. These devices, created by neuroscientists David Eagleman and Scott Novich at the Baylor College of Medicine, contain four LRAs (linear resonant actuators) that stimulate specific receptors in the skin through patterns of vibration. The skin receptors send electrical signals to your neurons and eventually to your brain, and your brain can learn to process this data as a sixth sense.
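To give a sense of how a detection becomes a vibration, here is a minimal sketch of writing one motor frame to a Buzz over BLE with Bleak. The device address, the characteristic UUID, and the Buzz developer CLI command strings are written from memory and should be treated as assumptions; check them against Neosensory's documentation before relying on them.

```python
# Hedged sketch: push one vibration frame to a Neosensory Buzz over BLE using Bleak.
# The address, UART characteristic UUID, and CLI command strings below are assumptions.
import asyncio
import base64

from bleak import BleakClient

BUZZ_ADDRESS = "XX:XX:XX:XX:XX:XX"  # hypothetical: your Buzz's BLE address
UART_RX_CHAR = "00000000-0000-0000-0000-000000000000"  # placeholder: Buzz UART RX characteristic UUID

async def send_frame(client: BleakClient, intensities: list[int]) -> None:
    """Encode four motor intensities (0-255) as base64 and send one vibration frame."""
    frame = base64.b64encode(bytes(intensities)).decode("ascii")
    await client.write_gatt_char(UART_RX_CHAR, f"motors vibrate {frame}\n".encode())

async def main() -> None:
    async with BleakClient(BUZZ_ADDRESS) as client:
        # Developer-mode handshake expected by the Buzz command-line interface (assumed commands).
        for cmd in ("auth as developer\n", "accept\n", "motors start\n"):
            await client.write_gatt_char(UART_RX_CHAR, cmd.encode())
        await send_frame(client, [255, 0, 0, 0])  # pulse the first LRA at full strength

asyncio.run(main())
```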

Specific patterns of vibration on the hands tell the user what they're looking at (for example, a chair corresponds to pattern A, a car to pattern B). High-priority objects like people and cars are relayed through feedback on the right hand, while low-priority objects (such as kitchenware and laptops) are relayed through feedback on the left hand. There are roughly 90 such object classes that the user can learn to recognize. Distance measured by the ultrasonic sensor is fed through a third Neosensory Buzz on the left leg, with vibration intensity corresponding to the distance to an obstacle.
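The routing logic itself is simple; a sketch of the idea is below. The labels, pattern tables, and the linear distance-to-intensity mapping are illustrative stand-ins, not the exact values we used.

```python
# Illustrative sketch of the pattern mapping and priority routing described above.
# Each pattern is a frame of per-motor intensities (0-255) for one Buzz's four LRAs.

HIGH_PRIORITY = {                  # relayed on the right-hand Buzz
    "person": [255, 0, 0, 0],
    "car":    [0, 255, 0, 0],
}
LOW_PRIORITY = {                   # relayed on the left-hand Buzz
    "laptop": [255, 0, 0, 0],
    "cup":    [0, 0, 255, 0],
}

def route_detection(label: str):
    """Return (hand, motor frame) for a detected class, or None if it has no pattern."""
    if label in HIGH_PRIORITY:
        return "right", HIGH_PRIORITY[label]
    if label in LOW_PRIORITY:
        return "left", LOW_PRIORITY[label]
    return None

def distance_to_intensity(distance_cm: float, max_range_cm: float = 400.0) -> int:
    """Map an ultrasonic distance reading to a motor intensity: closer means stronger."""
    clamped = max(0.0, min(distance_cm, max_range_cm))
    return int(255 * (1.0 - clamped / max_range_cm))
```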

How we built it

OptiLink's object detection inferences are all done on the NVIDIA Jetson Nano running MobileNet. By using NVIDIA's TensorRT to accelerate inference, we were able to run this object detection model at a whopping 24 FPS at roughly 12 W of power. Communication with the two Neosensory Buzz feedback devices on the arms was done over Bluetooth Low Energy via the Bleak library and the experimental Neosensory Python SDK. Echolocation distance processing is done on an Adafruit nRF52840 microcontroller connected to an ultrasonic sensor; it relays the processed distance data (via Bluetooth Low Energy) to a third Neosensory Buzz placed on the leg.
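For reference, the detection loop on the Nano looks roughly like the sketch below, using the jetson-inference Python bindings (the usual TensorRT-accelerated path for SSD-MobileNet on JetPack). The camera URI and model name are assumptions; adjust them to your setup.

```python
# Rough sketch of a TensorRT-accelerated detection loop on the Jetson Nano,
# using the jetson-inference Python bindings. Model name and camera URI are assumptions.
import jetson.inference
import jetson.utils

net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)
camera = jetson.utils.videoSource("csi://0")   # use "/dev/video0" for a USB camera

while True:
    img = camera.Capture()
    detections = net.Detect(img)
    for det in detections:
        label = net.GetClassDesc(det.ClassID)
        # Hand the label off to the grouping / BLE vibration pipeline sketched earlier.
        print(label, f"{det.Confidence:.2f}")
```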

Challenges we ran into

This was definitely the most challenging project we've executed to date (and we've made quite a few). Images carry a huge amount of data, and processing, condensing, and packaging that data into an understandable form through just two data streams is very difficult. However, by grouping the classes into general categories (for example, cars, motorcycles, and trucks were all grouped into motor vehicles) and sending one signal for the grouped category, we could condense the information into a more user-friendly form. Additionally, we included a built-in frame-rate limiter, which prevents the user from receiving too much information too quickly from the Neosensory Buzz devices. This lets the user understand the vibrational feedback far more effectively.
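A sketch of those two mitigations is below. The group names and the minimum update interval are illustrative values, not the exact ones we shipped.

```python
# Sketch of the two mitigations described above: class grouping and a frame-rate limiter.
# Group names and the 0.5 s minimum interval are illustrative values.
import time

CLASS_GROUPS = {
    "car": "motor_vehicle",
    "truck": "motor_vehicle",
    "motorcycle": "motor_vehicle",
    "person": "person",
}

class RateLimiter:
    """Drop updates that arrive less than min_interval seconds after the last one sent."""

    def __init__(self, min_interval: float = 0.5):
        self.min_interval = min_interval
        self._last = 0.0

    def allow(self) -> bool:
        now = time.monotonic()
        if now - self._last >= self.min_interval:
            self._last = now
            return True
        return False

limiter = RateLimiter()

def handle_detection(label: str) -> None:
    """Collapse a raw class label into its group and forward it if the limiter allows."""
    group = CLASS_GROUPS.get(label, label)
    if limiter.allow():
        print("send pattern for", group)  # stand-in for the BLE vibration send
```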

Accomplishments that we're proud of

We think we’ve created a unique approach to sight assistance for the blind. We’re proud to have presented a fully functional project, especially considering the complexities involved in its design.

What we learned

This was our first time working with the NVIDIA Jetson Nano. We learned a ton about Linux and how to leverage NVIDIA's powerful machine learning tools (the JetPack SDK and TensorRT). We also gained valuable experience building brain-machine interfaces and learned how to process and condense data for feeding into the nervous system.

What's next for OptiLink

OptiLink has room for improvement in its external design, user-friendliness, and range of features. The device currently has a learning curve: it naturally takes time to make sense of new sensory feedback being integrated into the nervous system. We could build a mobile application for training pattern recognition, integrate more data streams so users can better distinguish the vibrational patterns that correspond to specific classes, and streamline and improve the physical design. There’s lots of room for improvement, and we’re excited to continue working on this project!
