Inspiration
The inspiration for the AUDVI project stems from the increasing need to create inclusive technologies that cater to the diverse needs of individuals with disabilities. The visually impaired often face significant obstacles in their daily lives, especially when navigating unfamiliar environments. Traditional navigation aids, such as canes or guide dogs, are useful but can be limited in their ability to detect and classify obstacles accurately.
By using Arduino, combined with AI algorithms, we created a compact, low-power and cost-effective device that can detect objects, identify them, and provide audio cues to the user. This device can help the visually impaired navigate more safely and confidently in their daily lives, reducing their dependence on external assistance and enhancing their autonomy.
Moreover, our project has the potential to inspire further innovation in the field of assistive technology, paving the way for the development of new and more sophisticated tools to support individuals with disabilities. By leveraging the power of AI and open-source hardware platforms like Arduino, we can create a world that is more accessible and inclusive for everyone, regardless of their abilities. Ultimately, AUDVI represents a significant step forward in the journey toward creating a more equitable and inclusive society.
What it does
Our project tells the user what its camera sees in front of them. For the visually impaired, this is a very useful addition to canes and other aids. To accomplish this seemingly simple task, we split the processing into multiple stages, explained below.
How we built it
The first stage is offline pre-processing. By doing as much computation as possible beforehand, we minimize the energy used while on battery power, boosting AUDVI's battery life. The three things we prepare are the speech samples, the AI model, and the category buckets. First, we trained a MobileNet model and quantized it so that it could run inference on our resource-limited microcontroller. We then uploaded the weights to an SD card connected to the microcontroller. Next, we used a Python script and Google Text-To-Speech to generate an audio sample for each possible prediction (e.g., a file for "dog" or "mask"), and transferred these to the SD card as well. For the final step, we used natural language processing to generate "meta-categories", or buckets, to place items in. Our dataset contained a large variety of categories (such as different breeds of dogs and cats), so we condensed those into simple labels like just "cat" and "dog". We incorporated the resulting bucket table into the code that runs on the microcontroller.
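The bucketing step can be sketched in Python. The keyword rules and label names below are illustrative stand-ins, not our actual dataset or the NLP pipeline we used:

```python
# Collapse fine-grained dataset labels into simple "meta-category" buckets.
# The keyword -> bucket rules here are hypothetical examples standing in for
# the NLP-generated table described above.
BUCKET_KEYWORDS = {
    "retriever": "dog",
    "terrier": "dog",
    "beagle": "dog",
    "tabby": "cat",
    "siamese": "cat",
}

def bucket_for(label: str) -> str:
    """Map a fine-grained label to its meta-category bucket.

    Falls back to the label itself when no rule matches, so categories
    without a bucket still get an audio file of their own.
    """
    lowered = label.lower()
    for keyword, bucket in BUCKET_KEYWORDS.items():
        if keyword in lowered:
            return bucket
    return lowered

def audio_filename(label: str) -> str:
    """Name of the pre-generated speech sample for a label's bucket."""
    return bucket_for(label).replace(" ", "_") + ".wav"

if __name__ == "__main__":
    for label in ["Golden Retriever", "Siamese cat", "mask"]:
        print(label, "->", audio_filename(label))
```

Because the mapping is computed once offline, the microcontroller only ever does a cheap table lookup at run time.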
When AUDVI is on battery power in the real world, it only has to execute three simple steps to aid the user. First, it captures an image and runs it through the custom-trained MobileNet to identify what is in view. Then, it looks up the appropriate bucket for that item and retrieves the associated audio file. Finally, it plays the audio file. This happens in a loop, and no internet connection is needed at any point.
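The real on-device code is Arduino C++, but the control flow of the loop can be sketched in Python; every helper below is a hypothetical stand-in for a firmware routine, not the actual implementation:

```python
# A sketch of AUDVI's on-battery loop: capture -> classify -> bucket -> play.
# All helpers are hypothetical stand-ins for the Arduino firmware routines.

def capture_image() -> bytes:
    """Stand-in for the camera driver (returns a placeholder frame)."""
    return b"\x00" * (96 * 96)

def run_mobilenet(image: bytes) -> str:
    """Stand-in for the quantized MobileNet interpreter."""
    return "golden retriever"

# Illustrative slice of the precomputed bucket table loaded from the SD card.
BUCKETS = {"golden retriever": "dog"}

def play_audio(path: str) -> str:
    """Stand-in for streaming a WAV file from the SD card to the speaker."""
    return path

def detection_step() -> str:
    """One pass of the loop: capture, classify, look up bucket, play audio."""
    image = capture_image()
    label = run_mobilenet(image)
    bucket = BUCKETS.get(label, label)  # fall back to the raw label
    return play_audio(bucket + ".wav")

if __name__ == "__main__":
    # On the device this repeats forever; one pass is enough for the sketch.
    print(detection_step())
```

Nothing in the loop touches the network, which is why the device works fully offline.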
Challenges we ran into
The process of building AUDVI was both challenging and rewarding. Initially, we had to research and understand the needs of individuals with visual impairments, their limitations, and existing assistive technologies available in the market. This helped us in designing a device that could cater to their specific requirements.
One of the significant challenges we faced was in developing an AI algorithm that could accurately detect and classify objects in real time while running on the limited computing resources available on the Arduino board. We had to optimize the algorithm and implement it in a way that allowed for efficient processing and low power consumption. Current standard models such as YOLOv5 and newer have too many parameters to run on our microcontroller, so we had to use an older, smaller model instead. This came at the cost of detection accuracy.
Another challenge was in developing a compact and user-friendly design for the device. We had to consider factors such as portability, ease of use, and the need for audio cues that were easy to understand and interpret. Eventually, we decided we could simply attach the board to the side of a pair of goggles. This made it very unobtrusive and with a few more supplies we could make it almost invisible.
Testing the device in real-world environments was another critical aspect of the process. We had to ensure that the device could accurately detect and classify objects in a variety of settings and provide audio cues that were helpful to the user. This spurred us to add a lookup table that corrects common miscategorizations.
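The correction table is a simple remap applied after inference. The entries below are hypothetical examples of the kind of fixes we added, not our actual table:

```python
# Remap predictions the model often got wrong during real-world testing.
# These entries are hypothetical examples, not our actual correction table.
MISCATEGORIZATIONS = {
    "sombrero": "hat",
    "ski mask": "mask",
    "water bottle": "bottle",
}

def correct_label(prediction: str) -> str:
    """Apply the known-miscategorization table; pass other labels through."""
    return MISCATEGORIZATIONS.get(prediction, prediction)
```

This kind of post-hoc table is cheap enough to update after each field test, without retraining or re-quantizing the model.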
Team members contributed their skills and expertise in areas such as computer vision, hardware design, and natural language processing. We had to work together, experiment with different approaches, and iterate until we arrived at a design that was effective, efficient, and user-friendly.
Overall, the process of building AUDVI required us to overcome several challenges, but it was a fulfilling experience. We were able to learn new skills, such as AI on Arduinos, apply our knowledge in real-world scenarios, and create a device that could make a positive impact on the lives of individuals with visual impairments.
Accomplishments that we're proud of
Through our dedication and hard work, we have been able to overcome many challenges and develop a system that can accurately identify objects and speak their names in real time. We are proud that we managed to integrate these separate systems (text-to-speech, image recognition) with effective use of string manipulation! We have also been able to design and build a robust and reliable hardware system that is affordable and easy to use for blind users. We are proud that our project has the potential to make a real difference in the lives of blind people by providing them with greater independence and a better understanding of their surroundings.
What we learned
We learned how to program AI models on an Arduino microcontroller. We also learned how to train custom models with TensorFlow as an alternative to using pre-trained models, which gave us a background on the different datasets available, such as Pascal VOC and COCO. We also learned how to use Google's Text-To-Speech API so that we could generate the voice prompts.
We also learned about current solutions on the market that help blind people, and how we could build on their strengths (high accuracy) while addressing their weaknesses (generally, high cost).
What's next for AI Arduino Object Detector for the Visually Impaired
There are several potential avenues for future work that could further improve the device's functionality and accessibility. One possible direction is to integrate more advanced AI algorithms and machine learning models to improve its accuracy and its ability to detect objects in more complex environments, while ensuring the Arduino remains powerful enough to run inference on them.
Another area of potential future work is to develop a more intuitive and user-friendly interface for the device. This could involve incorporating voice commands or gestures to control the device, making it even more accessible to individuals with visual impairments.
Currently, the device is very exposed and could easily be damaged. The device could be made more robust and durable for use in different environments, including outdoor settings. This could involve exploring ways to improve its waterproofing or incorporating features such as shock-resistant materials or protective cases.
Making it more protected could be complemented with a more aesthetic design, so that AUDVI looks similar to eyeglasses or sunglasses. However, a challenge with making it more aesthetically pleasing is maintaining a similar price point. We might also face challenges with downsizing the Arduino, which is a bit too bulky for a sleek design in its current state.
Future work could also involve exploring partnerships and collaborations with organizations and communities that serve individuals with visual impairments to distribute and promote the device more widely. This could involve working with community centers, schools, or government agencies to raise awareness of the device's capabilities and benefits and provide training and support for users.
Overall, there are many exciting possibilities for future work on the AUDVI project, and we look forward to continuing to innovate and improve the device to make a positive impact on the lives of individuals with visual impairments.