Drone Vision

A compilation of 300 drone photos into a high-definition 3D model

Inspiration

Google Maps is selective about which regions it maps and models in 3D. We wanted something decentralized that anyone can produce.

What it does

Drone Vision is a data pipeline that lets anyone create 3D-modeled maps, and it uses machine learning and computer vision to tag images with metadata quickly and automatically.

How we built it

Using Python and Unreal Engine, drone photos are programmatically combined and re-rendered into a single 3D world model. The photos are fed into AI-powered image recognition software, and the resulting labels are attached to the model. In Unreal Engine, the model then serves as the surroundings for a virtual reality touring experience.

Challenges we ran into

3D model generation is demanding, and our hardware limitations made rendering and image processing take a significant amount of time. Furthermore, data formats and dimensions differ at many points in the lengthy conversion and piping process, so connecting each piece of the puzzle was no small task. Hard drive space also became an issue when sampling and meshing together hundreds of images. Additionally, IBM Watson was unable to detect object boundaries by itself, so we had to use OpenCV to find where objects are and then feed the image partitions into IBM Watson individually to get information about each one.
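
The detect-then-classify workaround can be illustrated with a minimal sketch. It is not our actual code: it uses only the Python standard library, a simple flood fill stands in for the OpenCV step that locates objects, and the `classify` callback is a hypothetical placeholder for the call to IBM Watson. The names `find_regions`, `crop`, and `tag_regions` are made up for this example.

```python
from collections import deque
from typing import Callable, List, Tuple

# (top, left, bottom, right); bottom and right are exclusive.
Box = Tuple[int, int, int, int]
Grid = List[List[int]]


def find_regions(mask: Grid) -> List[Box]:
    """Return one bounding box per 4-connected blob of nonzero pixels.

    Stand-in for the object-locating step (done with OpenCV in the project).
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes: List[Box] = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood fill from this pixel, tracking the blob's extent.
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom + 1, right + 1))
    return boxes


def crop(image: Grid, box: Box) -> Grid:
    """Cut the image partition covered by a bounding box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def tag_regions(image: Grid, mask: Grid,
                classify: Callable[[Grid], str]) -> List[Tuple[Box, str]]:
    """Classify each partition on its own and pair the label with its box."""
    return [(box, classify(crop(image, box))) for box in find_regions(mask)]
```

With a tiny grayscale grid and a toy brightness classifier in place of the recognition service, `tag_regions` returns one (box, label) pair per detected region, which is the information needed to place a label on the model.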

Accomplishments that we're proud of

In less than 24 hours, using several hundred photos, we were able to construct a complete high-definition 3D model of half of the uWaterloo campus. Though we did not have the hardware to port the full-resolution model to Unreal Engine, we still achieved a completed, explorable model for use in VR.

What we learned

We learned to account for the unexpectedly long time it takes to render 3D models and meshes. Next time, we will be better prepared for huge file sizes, extreme RAM usage, and GPU stress.