We are both AI and self-driving car researchers at Brown University. We are especially passionate about making autonomous cars and robots more accessible by applying intelligent software to "dumb," inexpensive sensors.
What it does
DeepDepth is a deep learning model that turns any 2D image into a 3D scene by predicting per-pixel depth from standard camera data.
How we built it
We used the NYU Depth V2 dataset, which consists of video frames of various rooms captured with a Kinect. It includes RGB, depth, and accelerometer data; we didn't use the accelerometer data for this project.
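As a sketch of the preprocessing this kind of pipeline involves (the shapes, layout, and masking convention below are illustrative assumptions, not our exact code): each NYU Depth V2 frame pairs a 640×480 RGB image with a depth map, and both need to be converted into the layout a convolutional network expects.

```python
import numpy as np

def preprocess_pair(rgb, depth):
    """Convert one RGB/depth frame into network-ready arrays.

    rgb   : (480, 640, 3) uint8 image
    depth : (480, 640) float32 depth map in meters
    The shapes and conventions here are illustrative assumptions.
    """
    # Scale RGB to [0, 1] and move channels first (NCHW layout).
    x = rgb.astype(np.float32) / 255.0
    x = np.transpose(x, (2, 0, 1))   # -> (3, 480, 640)
    # A Kinect depth of 0 marks a missing reading; mask those pixels
    # so they don't contribute to the training loss.
    valid = depth > 0
    y = depth.astype(np.float32)
    return x, y, valid

rgb = np.zeros((480, 640, 3), dtype=np.uint8)
depth = np.ones((480, 640), dtype=np.float32)
x, y, valid = preprocess_pair(rgb, depth)
# x.shape == (3, 480, 640); every pixel in this synthetic frame is valid
```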
To learn the depth-prediction task, we designed our own fully-convolutional network architecture using Baidu's PaddlePaddle deep learning framework.
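A fully-convolutional network keeps spatial structure end to end: strided convolutions shrink the feature maps, and upsampling restores the input resolution so the output is a per-pixel depth map. The layer configuration below is a hypothetical sketch, not our exact architecture; it just traces how the spatial dimensions propagate through such a stack.

```python
def conv_out(size, kernel, stride, pad):
    # Standard output-size formula for a convolution layer.
    return (size + 2 * pad - kernel) // stride + 1

h, w = 480, 640                      # NYU Depth V2 frame size
for _ in range(3):                   # hypothetical encoder: three stride-2 3x3 convs
    h = conv_out(h, kernel=3, stride=2, pad=1)
    w = conv_out(w, kernel=3, stride=2, pad=1)
print(h, w)   # 60 80 -- feature map is 1/8 of the input resolution

h, w = h * 8, w * 8                  # decoder: three 2x upsampling stages
print(h, w)   # 480 640 -- one predicted depth value per input pixel
```

The symmetry between the downsampling and upsampling halves is what lets the network emit a dense depth map rather than a single label.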
Challenges we ran into
Neither of us had worked with PaddlePaddle before, so there was a pretty steep learning curve to get acquainted with a new deep learning framework. We also ran into a lot of issues with image formats and pixel values when working with the NYU Depth V2 dataset. Lastly, as with all things deep learning, we spent a significant amount of time tuning our architecture and its various parameters.
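As an illustration of the kind of pixel-value mismatch involved (the exact encodings vary by dataset export, so treat the clipping range and scaling here as assumptions): depth arrives as floating-point meters, but most image tools expect 8-bit values, and converting carelessly silently clips or wraps the data.

```python
import numpy as np

def depth_to_uint8(depth_m, max_depth=10.0):
    """Map a float depth map in meters to an 8-bit grayscale image
    for visualization. max_depth is an assumed clipping range."""
    # Clip first; casting 20.0 m straight to uint8 would wrap, not saturate.
    clipped = np.clip(depth_m, 0.0, max_depth)
    return (clipped / max_depth * 255.0).astype(np.uint8)

d = np.array([[0.0, 5.0, 10.0, 20.0]], dtype=np.float32)
print(depth_to_uint8(d))   # [[  0 127 255 255]]
```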
Accomplishments that we're proud of
We are proud to have been able to get something up and running in a completely new framework in under 24 hours. It's always amazing when you finally see the training error start to drop.
What we learned
We learned how to use PaddlePaddle to construct deep learning architectures, as well as how to deploy them to an AWS cluster. We also learned far too much about the numerous image formats in existence.
What's next for DeepDepth
Given more time, we would love to build a live demo that works with your computer's webcam. We believe the ability to derive depth information from 2D image data has numerous industrial applications, from self-driving cars to 3D movies.