We were exploring Augmented Reality applications that involve QR codes or trackers, and looking at how to incorporate AR into our daily lives, particularly into the education of younger children.
The app displays a virtual image on the tracked surface; the image can be chosen based on the words scanned from the page.
We used the Unity framework to access the camera on our Android device and to take photos that could be saved and reused. Using Azure's Cognitive Vision API, we can convert a captured photo into text, which can then be passed to Azure's Bing Image Search API; this lets us parse through the images returned for the text found in the captured image.
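The photo-to-text-to-image flow described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the project's Unity code: `choose_image_for_photo`, `recognize_text`, and `search_images` are hypothetical names, and the two callables stand in for whatever wrappers make the actual Azure requests.

```python
from typing import Callable, List, Optional


def choose_image_for_photo(
    photo_bytes: bytes,
    recognize_text: Callable[[bytes], str],      # hypothetical OCR wrapper
    search_images: Callable[[str], List[str]],   # hypothetical image-search wrapper
) -> Optional[str]:
    """Return the URL of the first image found for the text in a photo.

    Returns None when no text is recognized or the search has no results.
    """
    text = recognize_text(photo_bytes)
    # Collapse newlines and repeated spaces so the OCR output forms one query.
    query = " ".join(text.split())
    if not query:
        return None
    urls = search_images(query)
    return urls[0] if urls else None


if __name__ == "__main__":
    # Stub services so the sketch runs without network access.
    def fake_ocr(photo: bytes) -> str:
        return "red\napple"

    def fake_search(query: str) -> List[str]:
        return ["https://example.com/" + query.replace(" ", "-") + ".png"]

    print(choose_image_for_photo(b"raw-photo-bytes", fake_ocr, fake_search))
```

Passing the two services in as callables keeps the selection logic testable on its own, which may help when the platform hosting the camera code makes debugging difficult.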
One major challenge was compatibility across devices running different versions of Android. Another was Unity itself, which didn't let us handle exceptions well. We struggled with the debugging process, which we had to route from the Java side because of Unity's limited exception handling.
We were able to get AR to run and display images on the tracker. We also managed to call both of Azure's APIs and parse the JSON responses we got back for the information we needed.
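As a minimal sketch of that parsing step, the Python below pulls the recognized words and the image URLs out of the two responses. It assumes the OCR response nests words under `regions` > `lines` > `words` with a `text` field, and that the image search response lists results under `value` with a `contentUrl` field; check these field names against the API version you call. The function names and sample payloads are made up for illustration.

```python
import json
from typing import List


def extract_ocr_text(ocr_json: str) -> str:
    """Join every recognized word in an OCR response into one string."""
    data = json.loads(ocr_json)
    words = [
        word["text"]
        for region in data.get("regions", [])   # assumed response layout
        for line in region.get("lines", [])
        for word in line.get("words", [])
    ]
    return " ".join(words)


def extract_image_urls(search_json: str) -> List[str]:
    """Collect the full-size image URLs from an image search response."""
    data = json.loads(search_json)
    return [
        item["contentUrl"]                      # assumed field name
        for item in data.get("value", [])
        if "contentUrl" in item
    ]


if __name__ == "__main__":
    # Hand-written sample payloads, not real API output.
    sample_ocr = json.dumps(
        {"regions": [{"lines": [{"words": [{"text": "red"}, {"text": "apple"}]}]}]}
    )
    sample_search = json.dumps(
        {"value": [{"name": "Apple", "contentUrl": "https://example.com/apple.png"}]}
    )
    print(extract_ocr_text(sample_ocr))
    print(extract_image_urls(sample_search))
```

Using `.get(..., [])` at each level means a response with no detected text or no search results yields an empty string or list instead of raising a `KeyError`.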
We learned that things that sound interesting to do are never easy to implement. Different platforms don't always play nicely with each other, which can cause trouble in the final integration of a client-side interface with back-end communications.
We created a prototype that doesn't actually read or track text from a captured image, but does display pictures on tracked surfaces, each of which can be assigned to an associated screenshot.