We thought about common problems faced by people with disabilities and concluded that navigation is one of the hardest, especially in crowded or obstacle-rich environments.
With this in mind, we set out to engineer a solution that helps visually impaired people avoid obstacles using a rather experimental approach: representing three-dimensional space through sound.
What it does
LaneWarn classifies the type and distance of objects, obstacles, and entities in the user's surroundings, rates their "hazardousness" (e.g. a vehicle ranks high), and decides whether to warn the user with an audio signal that appears to come from the obstacle's location and direction.
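The warning decision above can be sketched as a small scoring function. The class names, hazard weights, distance scaling, and threshold below are illustrative assumptions, not the values LaneWarn actually uses:

```python
# Hypothetical hazard weights per detected class (higher = more dangerous).
HAZARD_WEIGHTS = {"car": 1.0, "bus": 1.0, "bicycle": 0.6, "person": 0.3, "bench": 0.2}

def hazard_score(label: str, distance_m: float) -> float:
    """Combine the class hazard weight with proximity (closer = higher score)."""
    weight = HAZARD_WEIGHTS.get(label, 0.1)
    proximity = max(0.0, 1.0 - distance_m / 10.0)  # 0 beyond 10 m, 1 at 0 m
    return weight * proximity

def should_warn(label: str, distance_m: float, threshold: float = 0.4) -> bool:
    """Decide whether this detection is worth an audio warning."""
    return hazard_score(label, distance_m) >= threshold
```

With these assumed weights, a car 3 m away scores 1.0 × 0.7 = 0.7 and triggers a warning, while a bench 9 m away scores 0.02 and stays silent.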
How we built it
We took YOLOv3, adapted it to our use case, and fed it data from our Kinect running a custom firmware (based on freenect).
The machine-learning and depth-sensing scripts are written in Python. The audio backend is written in C# and interfaces with the Unity SDK. The modules communicate via HTTP and stdin.
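Combining the two sensor streams boils down to lifting each 2D YOLO detection into a 3D position using the Kinect depth map. Here is a minimal sketch with the standard pinhole back-projection; the intrinsics `FX`, `FY`, `CX`, `CY` are placeholder values, not our actual calibration:

```python
import numpy as np

FX, FY = 525.0, 525.0   # assumed focal lengths in pixels
CX, CY = 319.5, 239.5   # assumed principal point

def detection_to_3d(bbox, depth_map):
    """bbox = (x1, y1, x2, y2) in pixels; depth_map in metres, shape (H, W).
    Returns the (x, y, z) position of the detection in camera coordinates."""
    x1, y1, x2, y2 = bbox
    patch = depth_map[y1:y2, x1:x2]
    valid = patch[patch > 0]        # the Kinect reports 0 for missing depth
    z = float(np.median(valid))     # median is robust against bbox background
    u = (x1 + x2) / 2.0             # bounding-box centre in pixel coordinates
    v = (y1 + y2) / 2.0
    x = (u - CX) * z / FX           # pinhole back-projection
    y = (v - CY) * z / FY
    return x, y, z
```

Taking the median depth inside the box rather than the centre pixel makes the estimate tolerant to depth holes and background leaking into the bounding box.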
Challenges we ran into
Our largest challenge was the firmware, which we had to customize before it would compile and flash onto our Kinect. The second was fitting the machine-learning model and tuning the threshold values for our rather specific use case.
Accomplishments that we are proud of
We are especially proud of the audio backend. It renders the three-dimensional environment as sound and reliably alerts the user when an obstacle is potentially dangerous (e.g. a car or bus).
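The spatialisation itself runs in the Unity/C# backend; the Python sketch below only illustrates the geometry involved: converting an obstacle's 3D position (camera coordinates, metres) into the azimuth, elevation, and distance a 3D audio engine needs to place the warning sound "at" the obstacle. The axis convention is an assumption:

```python
import math

def direction_to_audio(x, y, z):
    """x: right, y: down, z: forward (a common camera convention).
    Returns (azimuth_deg, elevation_deg, distance_m); positive azimuth
    means the sound source sits to the user's right."""
    azimuth = math.degrees(math.atan2(x, z))
    elevation = math.degrees(math.atan2(-y, math.hypot(x, z)))
    distance = math.sqrt(x * x + y * y + z * z)
    return azimuth, elevation, distance
```

An obstacle dead ahead at 5 m yields azimuth 0° and elevation 0°; one a metre to the right and a metre ahead yields an azimuth of 45°.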
What we learned
We learned a lot about point clouds, object classification with neural networks, computer vision, and the challenges visually impaired people face every day.
What's next for LaneWarn
We hope to build a first release version and potentially publish it as open hardware.