Image recognition is an amazing tool we have now - but it is quite difficult to come up with a good demonstration of its full power. Image recognition is highly useful for automated tasks such as labeling and tagging images, but in real life you rarely get to appreciate its usefulness. So I changed the objective from "classifying images" to "verifying" that the classification a human has made is correct. This is ideal for games, and today I applied this approach to Google Street View, which is an amazing playground for finding unique images and landscapes.
What it does
To win this game, you have to collect a set number of keys in order to open the treasure box. Clues are given for winning the keys - these are objects and items that can be found on Google Street View. You take a capture of the Street View every time you come across the item you want to get. The Microsoft Computer Vision API is then used to identify the object in the capture, and if it matches the clue you are given a key.
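The verification step can be sketched roughly as follows. This is a minimal sketch, not the game's actual code: it assumes the image tagger returns a list of tags, each with a `name` and a `confidence`, and the function name `clue_found` and the 0.5 threshold are hypothetical choices made for illustration.

```python
def clue_found(clue, tags, min_confidence=0.5):
    """Return True if the clue appears among the tags with enough confidence.

    `tags` is assumed to be a list of dicts such as
    {"name": "bicycle", "confidence": 0.93}; this shape is an assumption.
    """
    wanted = clue.strip().lower()
    return any(
        tag["name"].lower() == wanted and tag["confidence"] >= min_confidence
        for tag in tags
    )


# Example: tags for one Street View capture (made-up values for illustration)
capture_tags = [
    {"name": "road", "confidence": 0.98},
    {"name": "Bicycle", "confidence": 0.81},
    {"name": "tree", "confidence": 0.32},
]
player_gets_key = clue_found("bicycle", capture_tags)
```

The point of the design is that the player makes the classification ("this is a bicycle") and the API only confirms or rejects it.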
The keys have three colours - these colours are assigned by Microsoft Custom Vision. Blue keys correspond to snapshots that are taken facing forward (the road continues straight). Yellow keys correspond to snapshots where the road turns or is at an angle, and red keys correspond to snapshots of side street views.
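As a rough illustration of the colour assignment, the sketch below picks the most probable tag from a classifier result and maps it to a key colour. The tag names `straight` and `side` are hypothetical (only "angled" is a tag name mentioned in this writeup), and the `tagName` / `probability` fields are an assumed response shape.

```python
# Hypothetical tag names mapped to the three key colours described above.
KEY_COLOUR = {
    "straight": "blue",   # facing forward, road continues straight
    "angled": "yellow",   # road turning or at an angle
    "side": "red",        # side street view
}


def key_colour(predictions):
    """Return the key colour for the most probable tag, or None if unknown.

    `predictions` is assumed to be a non-empty list of dicts such as
    {"tagName": "angled", "probability": 0.8}.
    """
    best = max(predictions, key=lambda p: p["probability"])
    return KEY_COLOUR.get(best["tagName"])


# Made-up classifier output for one snapshot
snapshot = [
    {"tagName": "straight", "probability": 0.15},
    {"tagName": "angled", "probability": 0.80},
    {"tagName": "side", "probability": 0.05},
]
colour = key_colour(snapshot)
```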
There is a time limit to acquiring the keys to open the treasure box. The progress bar on the left hand side shows the remaining time. New items are added as the game progresses.
How I built it
Challenges I ran into
My initial plan was to have the Custom Vision API classify not merely whether the road is at an angle, but also which way it is turning. However, I kept getting very low precision and recall (about 60%) when I tagged left-turning and right-turning roads separately. After I combined the two tags into "angled", precision and recall rose to 80%. I am guessing that the Custom Vision API automatically generates a mirrored dataset to train the model, and that this is the reason for the errors.
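To show why mirroring would hurt a left/right split but not a combined tag, here is a toy sketch. It only illustrates the guess above and says nothing about what Custom Vision actually does internally. The tiny grid "image" and the `turn_direction` labeller are made up for the example.

```python
def mirror(image):
    """Flip a row-major image left to right."""
    return [list(reversed(row)) for row in image]


def turn_direction(image):
    """Toy labeller: the road (1s) bends toward the side of the top row
    that holds more road pixels."""
    top = image[0]
    half = len(top) // 2
    left, right = sum(top[:half]), sum(top[len(top) - half:])
    if left == right:
        return "straight"
    return "left" if left > right else "right"


def merged_label(image):
    """Collapse left and right into the single 'angled' tag."""
    return "straight" if turn_direction(image) == "straight" else "angled"


# A 3x4 toy image of a road that bends to the left at the top.
LEFT_TURN = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]

original = turn_direction(LEFT_TURN)          # "left"
flipped = turn_direction(mirror(LEFT_TURN))   # "right"
```

If a training pipeline mirrored `LEFT_TURN` but kept its "left" tag, the model would see a right-turning road labelled "left". With the merged tag, the original and its mirror both stay "angled", so the labels remain consistent.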
Accomplishments that I'm proud of
Coming up with the concept was the most difficult part, as I planned this hackathon project to be reusable for student workshops and Microsoft blog posts. Although image recognition has an almost unlimited range of uses, it usually runs in the background, labeling tons of data without the user ever noticing. I am proud that this idea makes image recognition an interactive process and lets users fully appreciate what it does. I believe this project has some educational value and will appeal to other students who want to start using Microsoft APIs.
What I learned
I didn't know that you could put Google Street View and Google Maps side by side and make them interact with each other so easily. Being able to post image data straight from Google APIs to Microsoft APIs was also quite satisfying.
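One possible way to hand a Street View image to an image analysis service is sketched below. This is not necessarily how the game does it: it assumes the image is referenced through a Street View Static API URL and passed on in a JSON body of the form `{"url": ...}`. The helper names and the placeholder key are hypothetical, and no request is actually sent.

```python
import json
from urllib.parse import urlencode

STREET_VIEW_STATIC_URL = "https://maps.googleapis.com/maps/api/streetview"


def street_view_image_url(lat, lng, heading, api_key, size="600x400"):
    """Build a Street View Static API image URL for one camera position."""
    params = {
        "size": size,
        "location": f"{lat},{lng}",
        "heading": heading,
        "key": api_key,
    }
    return f"{STREET_VIEW_STATIC_URL}?{urlencode(params)}"


def vision_request_body(image_url):
    """Build the JSON body that points an image analysis call at the URL."""
    return json.dumps({"url": image_url})


# "YOUR_GOOGLE_KEY" is a placeholder, not a real credential.
image_url = street_view_image_url(48.8584, 2.2945, 90, "YOUR_GOOGLE_KEY")
body = vision_request_body(image_url)
```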
What's next for Streasure Hunt
Adding user logins, maintaining user progress and calculating scores.