deepstreetview.com

What it does

It turns Google Street View into works of art. Your world will never be the same.

Inspiration

The first iteration of the website was a Chrome extension, VR Harambe Sighting, in which Harambe was stochastically overlaid on the sidewalks of Google Street View using a suite of CV algorithms. Because of the asynchronous control flow, not every feature could be implemented in the browser. As a result, I decided to pivot the app towards deep learning.

The Tech

The frontend is written in vanilla JavaScript. It overrides the Google Street View API calls with custom event handlers that redirect the AJAX requests from Google's servers to a custom server, where the tiles are processed on the fly. The backend is a Python Flask microservice proxied with Nginx. The heavy lifting is abstracted away behind Torch, which is invoked through system calls.

Fast-neural-style was released one month ago and renders images drastically faster. As opposed to a full-fledged Convolutional Neural Network (CNN), it is implemented with simple tensor arithmetic not dissimilar to that commonly found in everyday Computer Vision. Rendering an image as large as a Google Street View panorama (3300x2048) would usually take at least a couple of hours. The new algorithm makes on-the-fly rendering possible, which opens the door to video and to server apps such as this one.

Special thanks to 1and1 for sponsoring the server, which has 4 CPUs and 16GB of memory.
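As a rough illustration of the "Torch through system calls" part, the backend could shell out to the Torch script once per image. This is only a sketch: the script name, the model path and the flags are assumptions based on how the fast-neural-style command line is usually invoked, and should be checked against the checkout installed on the server. The helper names `build_command` and `stylize` are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List

# Assumed values: verify the script name, model file and flags against the
# fast-neural-style checkout that is actually deployed.
SCRIPT = "fast_neural_style.lua"
MODEL = "models/eccv16/starry_night.t7"


def build_command(src: Path, dst: Path, gpu: int = -1) -> List[str]:
    """Return the argv that stylizes src into dst (gpu=-1 is assumed to mean CPU)."""
    return [
        "th", SCRIPT,
        "-model", MODEL,
        "-input_image", str(src),
        "-output_image", str(dst),
        "-gpu", str(gpu),
    ]


def stylize(src: Path, dst: Path, timeout_s: float = 600.0) -> Path:
    """Run Torch as a child process; raises if it exits non-zero or times out."""
    subprocess.run(build_command(src, dst), check=True, timeout=timeout_s)
    return dst
```

Passing the arguments as a list rather than a shell string avoids quoting problems when file names come from request parameters.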

Challenges

  1. I started deepstreetview around 6pm Saturday and finished at 4am Sunday. It would have more features if I had woken up earlier.

  2. The Google Maps API is low-level "callback hell". In the age of ES2017, it should be rewritten with the help of async/await and AMD. I want to thank ECMAScript guru Ron Chen for his guidance with the Google Maps API.

  3. The very first successful build used the same format as Google Street View. The tile size is 512x512, so the output looked pieced together, with visible gridlines between tiles. This was solved by stitching all the tiles together and cutting the result in half vertically (the rationale being that the middle part is usually the road). This makes the image much more coherent and easier to look at. The downside is that the server takes much longer (10x) to respond because of the nature of the algorithm.

  4. The 7x4 grid of tiles is fetched from Google at 512x512, so 28 requests have to be made. The fetching was originally written synchronously but was then converted to run in parallel using a simple threading model. This cut the fetch time from 2s to 210ms.

  5. Computation is expensive, so caching is necessary. Although the cache is implemented using /tmp/ and serialized Python strings, it works like a charm, so it is not a challenge when n is small.

  6. The server does not have a GPU, as 1and1 doesn't offer GPU instances. This drastically slows processing down. With a GPU, the server should respond within seconds.

  7. GPU instances are unreliable and expensive. Definitely not worth it for a sensational project such as this one. I think the future of this is to cache enough processed street view images and throw them in a vault.
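Challenges 3 to 5 can be sketched together as follows. This is a minimal illustration, not the code that runs on the server: the function names are hypothetical, tiles are modelled as plain lists of pixel rows so the example needs only the standard library, "cutting in half vertically" is assumed to mean a left/right split at the middle column, and the cache uses `pickle` files in the temp directory as a stand-in for the serialized strings in /tmp/.

```python
import hashlib
import pickle
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

COLS, ROWS = 7, 4  # tile grid quoted in the text (28 tiles of 512x512)


def fetch_all(fetch_tile, cols=COLS, rows=ROWS, workers=28):
    """Call fetch_tile(x, y) for every tile concurrently; returns {(x, y): tile}."""
    coords = [(x, y) for y in range(rows) for x in range(cols)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with coords.
        return dict(zip(coords, pool.map(lambda c: fetch_tile(*c), coords)))


def stitch(tiles, cols=COLS, rows=ROWS):
    """Join tiles (each a list of equal-length pixel rows) into one image."""
    image = []
    for y in range(rows):
        for r in range(len(tiles[(0, y)])):
            row = []
            for x in range(cols):
                row.extend(tiles[(x, y)][r])
            image.append(row)
    return image


def split_halves(image):
    """Split the stitched image into left and right halves at the middle column."""
    mid = len(image[0]) // 2
    return [row[:mid] for row in image], [row[mid:] for row in image]


def cached(key, compute, cache_dir=None):
    """Return the stored result for key, calling compute() only on a cache miss."""
    root = Path(cache_dir or tempfile.gettempdir())
    path = root / ("dsv-" + hashlib.sha1(key.encode()).hexdigest() + ".pkl")
    if path.exists():
        return pickle.loads(path.read_bytes())
    value = compute()
    path.write_bytes(pickle.dumps(value))
    return value
```

In a real service, `fetch_tile` would issue the HTTP request for one tile and the tiles would be image objects from an imaging library rather than nested lists. Threads fit here because the work is dominated by waiting on the network, which is consistent with the drop from 2s to 210ms reported above.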

Accomplishments

  1. Buying a Namecheap domain.

  2. The idea for the hack and the execution were pretty on point.

  3. The ML library Torch is easy to deploy. It took about 8 lines of code and works flawlessly.

  4. This was a generally successful hack, with lots of hacky code written.

Lessons learned

  1. It's canonical to respect the API

  2. Patience is the archenemy of programming

  3. Neural nets are hard to devise but bring good vibes when deployed

  4. Keeping up with The Hacker News will get you clever ideas

What's next for DeepStreetView

I have an Ethereum mining rig in my dorm, so I'll just use that to cache enough processed Street View tiles, throw everything on AWS and forget about this project. The Python Flask microservice is simply not robust enough to handle any load, so it has to be rewritten.
