We started with a modest goal: searching a video for certain keywords. Then we realized how awesome it would be to describe a scene in plain, everyday language and find the matching scenes in a video! Once that worked, we also added a feature that finds items and apparel in videos (e.g. movies) and creates a shopping experience for users through the eBay API.
What it does
We have developed a webpage where users can either a) upload a video of their own or b) provide a YouTube link. We preprocess the video and enable the following three features: 1) search for scenes using keywords or sentences spoken in the video, 2) search for scenes using everyday human descriptions (e.g. "Joey drinking milk in the kitchen"), and 3) search for eBay products using descriptions or keywords.
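To give a feel for the first feature, here is a minimal sketch of keyword search over a transcript, assuming the audio has already been transcribed into timestamped segments. The `Segment` class, the `search_transcript` function, and the sample lines are hypothetical names and data made up for illustration, not our actual code.

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed chunk of speech (hypothetical structure)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str     # words spoken in this chunk


def words_of(text):
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def search_transcript(segments, query):
    """Return the segments that contain every word of the query."""
    wanted = words_of(query)
    if not wanted:
        return []
    return [s for s in segments if wanted <= words_of(s.text)]


# Made-up transcript, for illustration only.
segments = [
    Segment(0.0, 4.5, "How you doing?"),
    Segment(4.5, 9.0, "Joey, where is the milk?"),
    Segment(9.0, 12.0, "The milk is in the kitchen."),
]

for hit in search_transcript(segments, "milk kitchen"):
    print(f"{hit.start:.1f}s - {hit.end:.1f}s: {hit.text}")
```

Matching on whole words rather than substrings keeps a query like "milk" from matching "milkshake"; the start time of each hit is what a player would seek to.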
How I built it
We utilize state-of-the-art models in object detection, image captioning, audio transcription, text matching and image recognition to make this project possible.
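As a rough illustration of the text matching step, the sketch below ranks frame captions against a user's description. It assumes an image captioning model has already produced one caption per sampled frame, and it swaps in plain word-overlap (Jaccard) scoring as a simple stand-in for the matching models we used. `rank_scenes` and the sample captions are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Word-overlap score between two sets, from 0.0 to 1.0."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_scenes(captions, description, top_k=3):
    """Return timestamps of the captions that best match the description.

    `captions` is a list of (timestamp_seconds, caption_text) pairs,
    assumed to come from a captioning model run on sampled frames.
    """
    query = tokenize(description)
    scored = [(jaccard(query, tokenize(text)), ts) for ts, text in captions]
    # Drop captions that share no words with the description.
    scored = [(score, ts) for score, ts in scored if score > 0]
    # Best score first; earlier timestamp breaks ties.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [ts for _, ts in scored[:top_k]]


# Made-up captions, for illustration only.
captions = [
    (12.0, "a man drinking milk in a kitchen"),
    (47.0, "two people sitting on a couch"),
    (80.0, "a woman cooking in a kitchen"),
]

print(rank_scenes(captions, "Joey drinking milk in the kitchen"))
```

Word overlap cannot tell that "Joey" is the man in the frame, which is part of why a real pipeline would pair captioning with image recognition and a stronger text matching model.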
Challenges I ran into
The computational requirements of processing a relatively long video were a challenge for our team, since we did not have access to more powerful hardware. Also, part of our processing speed was bound by the response times of the external APIs we used along the way (e.g. Google Cloud Platform, Clarifai).
Accomplishments that I'm proud of
We are proud to have a working demo with all planned features finished!
What I learned
Everyone on the team tried and experimented with new technologies we had never worked with before. It was super helpful and fun!
What's next for VideoSurfer
We would like to see how the judges like our project, and then we will go from there.