For our project, Curiosity, we created a fully extensible object identification framework and application for iOS. Curiosity enriches its users' interactions with everyday objects by overlaying information that is both relevant and interesting. Currently, Curiosity can detect the logos of many different companies and display information such as a company's headquarters location and data about its founders. It can also detect soda cans and present their nutritional information to the user. The learning process is streamlined, so the application can easily be extended to detect and augment many more objects than the ones we have considered so far.
Curiosity uses a number of machine learning and computer vision techniques, but before we can analyze any data we first collect it and store it in MongoDB. We began by scraping InvestorGuide for a list of publicly traded companies, then added all of the PennApps sponsors that were not already on this list. For each company we generated a training set of images by parsing Google Images results. We followed a similar process to create a database of soda can images. This data is then passed to the training step of our machine learning pipeline. We extract SIFT-like features using OpenCV, and Curiosity then trains a multiclass Support Vector Machine (SVM) to classify the new objects it sees. To improve classification and object recognition, our application drops features that are statistical outliers across multiple images.
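The outlier-dropping step could look something like the sketch below. This is only an illustration, not the code Curiosity runs: it assumes the descriptors from several training images of one object are pooled into a single list, and that "outlier" means a descriptor whose distance from the pooled centroid is more than `max_z` standard deviations above the mean distance. The function name `drop_outlier_features` and the threshold of 2.0 are made up for this example.

```python
import math
import statistics


def drop_outlier_features(descriptors, max_z=2.0):
    """Keep descriptors that are not unusually far from the pooled centroid.

    Hypothetical helper: `descriptors` is a list of equal-length numeric
    vectors gathered from several images of the same object.
    """
    if len(descriptors) < 3:
        return list(descriptors)  # too few samples to estimate a spread

    dim = len(descriptors[0])
    # Mean descriptor across all pooled images.
    centroid = [statistics.fmean(d[i] for d in descriptors) for i in range(dim)]
    # Euclidean distance of every descriptor from that mean.
    distances = [math.dist(d, centroid) for d in descriptors]

    mean = statistics.fmean(distances)
    stdev = statistics.pstdev(distances)
    if stdev == 0:
        return list(descriptors)  # every descriptor is equally far away

    # Assumed rule: drop anything more than max_z deviations above the mean.
    return [d for d, dist in zip(descriptors, distances)
            if (dist - mean) / stdev <= max_z]
```

Only the descriptors that survive this filter would then be handed to the SVM training step. In practice the descriptors would be the SIFT-like vectors extracted with OpenCV, and the threshold would need tuning against real images.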
On the user's end, a clean interface prompts them to either choose an existing image or use their camera in real time. The user then draws a circle around the area they are curious about. The region of interest is sent to a more powerful server for analysis, and the resulting information is sent back to the iPhone and overlaid for ease of access. The user can scroll through the information, and if the scanned object is a soda can, Curiosity also presents its nutritional information.
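As a rough illustration of the region-of-interest step, the sketch below turns a drawn circle into a rectangular crop, so that only that part of the image needs to be uploaded. It assumes the circle is already known as a centre and radius in pixel coordinates and that the image is a plain list of pixel rows. Both helper names are hypothetical, and the real app presumably does this with iOS image APIs rather than Python.

```python
import math


def circle_to_crop_box(cx, cy, radius, image_width, image_height):
    """Return (left, top, right, bottom) for the square enclosing the circle.

    The box is clamped to the image bounds; right and bottom are exclusive.
    """
    left = max(0, math.floor(cx - radius))
    top = max(0, math.floor(cy - radius))
    right = min(image_width, math.ceil(cx + radius))
    bottom = min(image_height, math.ceil(cy + radius))
    if left >= right or top >= bottom:
        raise ValueError("circle lies outside the image")
    return left, top, right, bottom


def crop(pixels, box):
    """Cut the boxed region out of an image stored as a list of rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in pixels[top:bottom]]
```

For example, a circle centred at (50, 40) with radius 10 in a 100 by 100 image gives the box (40, 30, 60, 50), and a circle that spills over an edge is clipped to the image instead of failing.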
In general, Curiosity provides users with information quickly and readily. Although Curiosity currently supports only company logos and soda cans, it can easily be extended to analyze many more objects, including significantly more complicated ones. For example, Curiosity could conceivably provide information about different species of plants just by looking at them, or it could run on remote autonomous robots to provide qualitative information about their surroundings without streaming the full video data.