What it does
Our project was built on a dataset provided by Rohde & Schwarz.
The 'analyze.py' script uploads images to the Microsoft Computer Vision API, which returns text descriptions of them. By feeding a series of frames into this API, we can likely assemble a story from the returned descriptions.
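A minimal sketch of how such an upload could look, using the Computer Vision REST "describe" endpoint. The endpoint path, the helper names, and the argument names here are assumptions for illustration, not the actual contents of analyze.py; a real call needs a valid Azure endpoint and subscription key.

```python
import json
import urllib.request


def build_describe_url(endpoint):
    """Build the 'describe' URL from an Azure Computer Vision endpoint."""
    return endpoint.rstrip("/") + "/vision/v2.0/describe"


def describe_frame(image_path, endpoint, subscription_key):
    """Upload one frame and return the caption texts the API suggests.

    Hypothetical helper -- requires real Azure credentials to run.
    """
    with open(image_path, "rb") as f:
        image_data = f.read()
    req = urllib.request.Request(
        build_describe_url(endpoint),
        data=image_data,
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "application/octet-stream",
        },
    )
    with urllib.request.urlopen(req) as resp:
        analysis = json.load(resp)
    # The response nests suggested captions under description.captions.
    return [c["text"] for c in analysis["description"]["captions"]]
```

Calling describe_frame once per frame and collecting the captions in order gives the raw material for the "story" described above.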
The 'detect_logos.py' script detects all the logos appearing in a frame; the boundary of each logo is drawn as light blue dots, along with an orange bounding box.
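The overlay itself is straightforward to reproduce. This is a hedged sketch (not the actual detect_logos.py code) that marks boundary points as light blue dots and encloses them in a one-pixel orange box on a NumPy RGB image; the RGB values and the helper name are my own choices.

```python
import numpy as np

LIGHT_BLUE = (173, 216, 230)  # RGB used for the boundary dots
ORANGE = (255, 165, 0)        # RGB used for the bounding box


def draw_logo_overlay(frame, boundary_points):
    """Return a copy of frame with boundary_points drawn as light blue
    dots and an orange bounding box enclosing them.

    frame: HxWx3 uint8 array; boundary_points: list of (row, col) pairs.
    """
    out = frame.copy()
    rows = [r for r, _ in boundary_points]
    cols = [c for _, c in boundary_points]
    r0, r1 = min(rows), max(rows)
    c0, c1 = min(cols), max(cols)
    # Orange bounding box, 1 px thick.
    out[r0, c0:c1 + 1] = ORANGE
    out[r1, c0:c1 + 1] = ORANGE
    out[r0:r1 + 1, c0] = ORANGE
    out[r0:r1 + 1, c1] = ORANGE
    # Light blue dots drawn on top of the box.
    for r, c in boundary_points:
        out[r, c] = LIGHT_BLUE
    return out
```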
How we built it
Detection is done by extracting HOG features from a series of frames and then using DBSCAN clustering to find the edges of the logos of interest.
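The clustering step can be sketched as follows. As a simplified, runnable stand-in for the HOG descriptor, this sketch thresholds plain gradient magnitude to get candidate edge pixels, then groups them spatially with scikit-learn's DBSCAN and returns one bounding box per cluster; the thresholds, eps, and min_samples values are illustrative assumptions, not the project's tuned parameters.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def cluster_edge_points(gray, grad_thresh=50.0, eps=3.0, min_samples=4):
    """Find strong-gradient pixels and group them into spatial clusters
    with DBSCAN; return a (r0, c0, r1, c1) bounding box per cluster.

    Gradient magnitude stands in here for the HOG feature extraction.
    """
    gy, gx = np.gradient(gray.astype(float))
    magnitude = np.hypot(gx, gy)
    points = np.argwhere(magnitude > grad_thresh)  # (row, col) pairs
    if len(points) == 0:
        return []
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
    boxes = []
    for label in sorted(set(labels) - {-1}):  # label -1 marks noise
        cluster = points[labels == label]
        r0, c0 = cluster.min(axis=0)
        r1, c1 = cluster.max(axis=0)
        boxes.append((r0, c0, r1, c1))
    return boxes
```

DBSCAN suits this task because the number of logos per frame is unknown in advance, and stray high-gradient pixels are naturally discarded as noise.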
For the classification task, we trained a convolutional neural network as the classifier. Thanks to Microsoft, we had the computational power of 24 GPUs on their virtual machines to train this network.
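The actual training used a deep learning framework on Azure GPUs (the write-up does not say which one); as a framework-free illustration of what the network's convolutional layers compute, here is a minimal NumPy sketch of a single 2D convolution followed by a ReLU, with a hand-written vertical-edge filter of the kind a trained first layer typically learns.

```python
import numpy as np


def conv2d(image, kernel):
    """Valid-mode 2D convolution (technically cross-correlation, as in
    most deep learning frameworks) of an HxW image with a kh x kw kernel."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x):
    """Rectified linear unit, the usual nonlinearity between conv layers."""
    return np.maximum(x, 0)


# A Sobel-style vertical-edge kernel; in a real CNN these weights
# are learned from the logo training data rather than hand-set.
edge_kernel = np.array([[1.0, 0.0, -1.0],
                        [2.0, 0.0, -2.0],
                        [1.0, 0.0, -1.0]])
```

Stacking many such learned filters, nonlinearities, and pooling layers, then a small fully connected head, yields a logo classifier; the 24 GPUs parallelize the gradient updates over large batches of frames.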
What we learned
Thanks to Rohde & Schwarz, we had plenty of data to explore and to feed into interesting machine learning APIs. Thanks to Microsoft Azure, we could immediately use almost all of the traditional machine learning algorithms and tools we know, and we had the opportunity to try out many of the interesting APIs Azure provides. On a virtual machine, we built the infrastructure on which we trained our convolutional neural network for classifying logos.
What's next for PixelBlinder
There are many possible extensions. For example, based on the text descriptions collected from several frames of a program, we could use AI to organize them into an abstract of that program. Given the descriptions of 6 neighboring frames, these messages might be combined into a summary such as "a car accident happened on a highway on a snowy day."
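The grouping step of this idea is easy to prototype even before the summarizer exists. A minimal sketch, assuming per-frame captions arrive as an ordered list: slice them into windows of 6 and collapse consecutive duplicates (a placeholder for a real abstractive summarization model, which is not implemented here).

```python
def window_captions(captions, window=6):
    """Split an ordered list of per-frame captions into consecutive
    windows; each window is the input to one abstract sentence."""
    return [captions[i:i + window] for i in range(0, len(captions), window)]


def naive_window_summary(captions):
    """Drop consecutive duplicate captions and join the rest.

    A placeholder for an abstractive summarizer, which would instead
    generate one new sentence describing the whole window.
    """
    kept = []
    for caption in captions:
        if not kept or caption != kept[-1]:
            kept.append(caption)
    return "; ".join(kept)
```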