Inspiration
You are watching a Crash Course or Khan Academy video packed with complex, interdependent concepts. You quickly become confused and wish you could ask a teacher or tutor about a concept.
What it does
IR-Video uses Google's Cloud Speech API and api.ai's natural language processing to analyze the captions of YouTube videos for their content.
When the user submits a video for processing, IR-Video gets the captions.
Finally, the web app lets you ask a context-based question. It uses the Google Cloud Speech API and api.ai to interpret the question. The video's captions are parsed and timestamped so that individual words can be indexed. The web app then answers your question by taking you to the exact moment in the video that contains the answer.
How I built it
I built IR-Video with Flask, using Bootstrap and Handlebars to render the frontend. The Google Cloud Speech API captures the user's question, and Dialogflow processes it. Video processing runs as background tasks using Celery and Redis.
Challenges I ran into
The very first challenge was downloading captions from the video. It turns out YouTube doesn't have a good API for doing this. After looking around, I realized there were dynamic links that could be used to download the XML-formatted captions. I then figured out, with a lot of help from Google, how to parse the caption text for specific words and return the matching timestamp.
Python Flask web development traps, such as missing __init__.py files and "No module named" errors.
The web app only accepts single-word questions. It can't answer "What is photosynthesis?"; it only accepts "Photosynthesis".
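The caption lookup described in the first challenge can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the sample XML layout (a `transcript` root with `text` elements carrying `start` and `dur` attributes) is an assumption about the caption format, and the function names `find_keyword_times` and `timestamp_url` as well as the `VIDEO_ID` placeholder are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import unescape

# Hypothetical sample; the real caption files may use different tags or attributes.
SAMPLE_CAPTIONS = """<transcript>
  <text start="0.0" dur="4.2">Welcome to this lesson on plants</text>
  <text start="4.2" dur="5.1">Photosynthesis turns light into chemical energy</text>
  <text start="9.3" dur="3.0">It happens inside the chloroplasts</text>
</transcript>"""


def find_keyword_times(captions_xml, keyword):
    """Return the start time (in seconds) of every caption containing keyword."""
    needle = keyword.strip().lower()
    if not needle:
        return []
    root = ET.fromstring(captions_xml)
    times = []
    for node in root.iter("text"):
        # Caption text may contain HTML entities, so decode before matching.
        caption = unescape(node.text or "").lower()
        if needle in caption:
            times.append(float(node.get("start", "0")))
    return times


def timestamp_url(video_id, seconds):
    """Build a YouTube watch link that starts playback at the given second."""
    return f"https://www.youtube.com/watch?v={video_id}&t={int(seconds)}s"


if __name__ == "__main__":
    for start in find_keyword_times(SAMPLE_CAPTIONS, "Photosynthesis"):
        print(timestamp_url("VIDEO_ID", start))
```

Matching a single keyword against each caption line mirrors the single-word limitation noted above; handling full questions would need an extra step to extract the key term first.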
Accomplishments that I'm proud of
The project works!
What I learned
I learned to use Celery for background processes and to expose an endpoint that can be queried to report the state of a background operation.
Python venv
Flask web forms and templates
What's next for IR-Video
Next, I would add a recurrent neural network (RNN) model with LSTM units for better answers, and improve question input with Dialogflow (formerly api.ai) so the app can handle fuller questions.