AudioVision provides navigation for the vision impaired. We are interested in sensory perception, replacement, and augmentation.
What it does
AudioVision uses a head-mounted depth camera and IMU to create a 3D representation of the environment, then maps that environment back into 3D space using directional audio.
How we built it
The core of our project is the Occipital Structure Sensor, a depth sensor similar to the Kinect. We use its depth data, combined with gyroscope and accelerometer data, to create a world-space mapping of our environment. We read frames from the camera as fast as possible while concurrently reading data from the gyroscope. For each frame, we first use an inverse perspective projection to transform the frame into camera space, then use the gyroscope data to transform it into world space based on orientation. We store these points and process them to remove noise and to determine interesting places in the world to position audio. This pipeline is expensive, sometimes running at only 10 FPS, so to lower latency we keep previously calculated sound points and transform them using gyroscope data between frames.
For each point cloud, we down-sample using a voxel grid, then use a clustering algorithm to remove small areas that are likely uninteresting or noise. From the remaining candidate points, we randomly sample a few as audio sources and create a square wave whose frequency is based on depth. We composite these points with those of past frames and use a head-related transfer function to create spatial sound.
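To illustrate the per-pixel transform, here is a minimal C++ sketch of the inverse perspective projection and orientation step. The pinhole intrinsics (fx, fy, cx, cy) and the 3x3 rotation matrix are illustrative assumptions; the real sensor calibration and IMU fusion code are not shown.

```cpp
#include <array>

struct Vec3 { double x, y, z; };
using Mat3 = std::array<std::array<double, 3>, 3>;

// Inverse perspective projection: lift a depth pixel (u, v, depth) into
// camera-space coordinates using a pinhole camera model.
inline Vec3 unproject(double u, double v, double depth,
                      double fx, double fy, double cx, double cy) {
    return { (u - cx) * depth / fx,
             (v - cy) * depth / fy,
             depth };
}

// Rotate a camera-space point into world space using the orientation
// estimated from the gyroscope/accelerometer.
inline Vec3 to_world(const Mat3& R, const Vec3& p) {
    return { R[0][0]*p.x + R[0][1]*p.y + R[0][2]*p.z,
             R[1][0]*p.x + R[1][1]*p.y + R[1][2]*p.z,
             R[2][0]*p.x + R[2][1]*p.y + R[2][2]*p.z };
}
```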
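The voxel-grid down-sampling step can be sketched as follows. This is a simplified version that keeps at most one point per voxel of side `leaf`; a production filter (e.g. PCL's VoxelGrid, which inspired this step) averages the points in each voxel instead of keeping the first.

```cpp
#include <cmath>
#include <cstdint>
#include <unordered_set>
#include <vector>

struct Point { float x, y, z; };

// Keep at most one point per (leaf x leaf x leaf) voxel by hashing each
// point's voxel indices and dropping points whose voxel was already seen.
inline std::vector<Point> voxel_downsample(const std::vector<Point>& cloud,
                                           float leaf) {
    std::unordered_set<std::uint64_t> seen;
    std::vector<Point> out;
    for (const Point& p : cloud) {
        // Pack the three voxel indices into one 64-bit key (21 bits each,
        // offset so negative coordinates map to non-negative indices).
        auto idx = [leaf](float c) {
            return static_cast<std::uint64_t>(
                static_cast<std::int64_t>(std::floor(c / leaf)) + (1 << 20));
        };
        std::uint64_t key = (idx(p.x) << 42) | (idx(p.y) << 21) | idx(p.z);
        if (seen.insert(key).second)
            out.push_back(p);
    }
    return out;
}
```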
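The depth-to-pitch mapping can be sketched like this: nearer obstacles get higher-pitched square waves. The frequency range (200–2000 Hz), the depth clamp, and the linear mapping are assumptions for illustration; the writeup only specifies that frequency is based on depth, with the spatial positioning handled separately by the HRTF.

```cpp
#include <cmath>
#include <vector>

// Map a depth in meters to a square-wave frequency: near objects are
// high-pitched, far objects low-pitched. Depths are clamped to [min_d, max_d].
inline double depth_to_frequency(double depth_m,
                                 double min_d = 0.5, double max_d = 5.0,
                                 double f_hi = 2000.0, double f_lo = 200.0) {
    double t = (depth_m - min_d) / (max_d - min_d);  // 0 = near, 1 = far
    t = std::fmin(1.0, std::fmax(0.0, t));
    return f_hi + t * (f_lo - f_hi);
}

// Generate mono square-wave samples at the given frequency; a spatial audio
// layer would then position the source in 3D.
inline std::vector<float> square_wave(double freq, double seconds,
                                      int sample_rate = 44100) {
    std::vector<float> samples(static_cast<std::size_t>(seconds * sample_rate));
    for (std::size_t i = 0; i < samples.size(); ++i) {
        double phase = std::fmod(freq * static_cast<double>(i) / sample_rate, 1.0);
        samples[i] = phase < 0.5 ? 1.0f : -1.0f;
    }
    return samples;
}
```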
Challenges we ran into
We had a lot of problems integrating all of our separate codebases into the final design.
A bug involving proxy types combined with type inference in C++, which only appeared in release builds, was particularly frustrating.
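The exact bug isn't described above, but the textbook instance of this pitfall is `std::vector<bool>`: `operator[]` returns a proxy object rather than a `bool`, so `auto` deduces the proxy type and the variable silently tracks later writes to the container. This sketch is a representative example, not our actual code.

```cpp
#include <vector>

// `auto` deduces std::vector<bool>::reference (a proxy), so the "captured"
// value changes when the container is later mutated.
inline bool snapshot_then_mutate() {
    std::vector<bool> v{true, false};
    auto b = v[0];                // proxy, not a bool copy
    v[0] = false;                 // mutating v changes what `b` observes
    return static_cast<bool>(b);  // false, not the `true` we meant to capture
}

// Spelling out the type forces a real copy, which behaves as expected.
inline bool explicit_copy_then_mutate() {
    std::vector<bool> v{true, false};
    bool b = v[0];                // genuine bool copy
    v[0] = false;
    return b;                     // still true
}
```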
A lot of work went into ensuring that our project was responsive to the wearer's movement. We run many expensive algorithms, but the system cannot lag behind someone quickly moving their head.
Accomplishments that we're proud of
Point orientation correction using IMU sensor readings. This was a huge milestone for us because it meant that our project was actually feasible.
Using pitch, in addition to gain, to convey how far away objects are. This greatly improved the usability of our project.
The structure of the code was well thought out at the start of the project, allowing each team member to work on their own abstraction throughout the entire process.
What we learned
Having a plan for integration from the start is important. We also learned several technical skills, such as OpenAL, clustering, voxel grids, and C++ in general.
What's next for AudioVision
We would like to add the ability to read text via OCR, generate a text-to-speech audio clip, and place that clip in the 3D environment at the position of the text. This would allow the user to read text in the environment that isn't braille, or is too far away for a touch-based text system. Examples include street signs, billboards, building labels, and addresses.