300 drone photos compiled into a high-definition 3D model
Inspiration
Google Maps is selective about the regions it maps and models in 3D; we wanted something decentralized that anyone can produce.
What it does
Drone Vision is a data pipeline that lets anyone create 3D-modelled maps, using machine learning and computer vision to enable fast, automatic metadata image tagging.
How we built it
Using Python and Unreal Engine, drone photos are programmatically combined and re-rendered into a single 3D world model. The photos are fed into AI-powered image recognition software, and the resulting labels are placed onto the model. Finally, the model becomes the environment for a virtual-reality touring experience in Unreal Engine.
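One practical step in combining hundreds of photos is splitting them into spatial chunks so each meshing job stays within memory limits. The sketch below is hypothetical (function names, grid size, and coordinates are our illustrative assumptions, not the project's actual code): it buckets geotagged photo records into grid cells so each cell can be meshed independently.

```python
import math
from collections import defaultdict

def bucket_photos(photos, cell_deg=0.001):
    """Group (path, lat, lon) records into grid cells (~100 m at this
    latitude) so each photogrammetry chunk can be meshed separately.
    Hypothetical helper, not the project's actual pipeline code."""
    tiles = defaultdict(list)
    for path, lat, lon in photos:
        # Floor-divide coordinates into a coarse grid key.
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        tiles[key].append(path)
    return dict(tiles)

# Illustrative records with made-up coordinates near the uWaterloo campus.
photos = [
    ("img_001.jpg", 43.4723, -80.5449),
    ("img_002.jpg", 43.4724, -80.5449),  # same cell as img_001
    ("img_003.jpg", 43.4790, -80.5510),  # different cell
]
tiles = bucket_photos(photos)
```

Meshing each tile separately, then stitching the tiles in the engine, trades a little seam cleanup for much lower peak RAM and disk usage.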
Challenges we ran into
Due to the nature of 3D model generation, hardware limitations caused rendering and image processing to take a significant amount of time. Furthermore, since data formats and dimensional differences are numerous throughout the lengthy conversion and piping process, connecting each piece of the puzzle was no small task. Hard drive space also became an issue when sampling and meshing together hundreds of images. Additionally, IBM Watson is unable to detect object boundaries by itself, so we used OpenCV to locate objects and then fed each image partition into IBM Watson individually to retrieve its labels.
Accomplishments that we're proud of
In less than 24 hours, and from several hundred photos, we were able to completely construct a high-definition 3D model of half of the uWaterloo campus. Though we lacked the hardware to port the full-resolution model into Unreal Engine, we still achieved a complete, explorable model for use in VR.
What we learned
We learned to account for the unexpectedly long time it takes to render 3D models and meshes. Next time, we will be better prepared for huge file sizes, extreme RAM usage, and GPU stress.