Some of the most enriching and interesting video content on the web comes from educational platforms and conferences that post their full-length videos online. Watching these long videos is time well spent, but viewers with busy schedules are often forced to read a quick summary of the happenings instead of truly engaging with the content. Inspired by this problem, we created a tool that shortens these videos and delivers the essential information viewers want.
How it works
We first scrape the audio from a broadcast event and connect its phrases and sentences to form larger statements. Each statement is assigned a sentiment score using the sentiment analysis in Google Cloud's Natural Language API. We monitor how the score changes, and moments that stand out (those that cross a threshold relative to the mean) are filtered into the highlight reel. Auditory insights, such as applause and laughter, are also factored into the highlights. Next, Google Cloud's Video Intelligence API is used to analyze changes in setting and speaker within a scene. This data is merged with the audio timestamps to create a smooth, complete video reel.
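The statement grouping and the threshold step can be sketched as below. This is a minimal illustration, not our production code: the function names, the fixed block size, and the rule "more than k standard deviations from the mean" are all assumptions made for the example, and the scores are taken to be numbers already returned by the sentiment API (which reports them in the range -1 to 1).

```javascript
// Sketch only: function names and the standard-deviation rule are assumptions.

// Join consecutive caption cues into larger "statements" of blockSize cues each.
// Each cue is assumed to look like { text, start, end } with times in seconds.
function groupIntoStatements(cues, blockSize) {
  const statements = [];
  for (let i = 0; i < cues.length; i += blockSize) {
    const block = cues.slice(i, i + blockSize);
    statements.push({
      text: block.map((c) => c.text).join(" "),
      start: block[0].start,
      end: block[block.length - 1].end,
    });
  }
  return statements;
}

// Keep statements whose sentiment score lies more than k standard
// deviations away from the mean score of the whole video.
function pickHighlights(scored, k) {
  if (scored.length === 0) return [];
  const scores = scored.map((s) => s.score);
  const mean = scores.reduce((a, b) => a + b, 0) / scores.length;
  const variance =
    scores.reduce((a, b) => a + (b - mean) ** 2, 0) / scores.length;
  const std = Math.sqrt(variance);
  if (std === 0) return []; // every statement scored the same, nothing stands out
  return scored.filter((s) => Math.abs(s.score - mean) > k * std);
}
```

Using the absolute deviation means that strongly negative moments are kept as well as strongly positive ones. Both `blockSize` and `k` are the kind of parameters that need tuning per video.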
How we built it
Our backend is built in Node.js as an Express server that interacts with the Google Cloud APIs. youtube-dl is used to extract closed captions for an almost instant transcription. The server outputs an array of timestamps, which is passed to a PHP script that uses ffmpeg to split the original source and combine the pieces into a single highlights reel.
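To illustrate the split-and-combine step, here is a sketch of how an array of timestamps could be turned into ffmpeg command-line arguments. It is written in JavaScript rather than PHP to match the server code, and it is not our actual script: the helper names and the file names (`clip0.mp4`, `clips.txt`) are hypothetical. It relies on the standard ffmpeg options `-ss`, `-t`, `-c copy`, and the concat demuxer (`-f concat`).

```javascript
// Sketch only: helper names and file names are hypothetical.

// Arguments to cut one clip, given { start, end } in seconds, without re-encoding.
function clipArgs(input, segment, output) {
  const duration = segment.end - segment.start;
  return [
    "-ss", String(segment.start), // seek to the start of the highlight
    "-i", input,
    "-t", String(duration),       // keep only this many seconds
    "-c", "copy",                 // copy the streams instead of re-encoding
    output,
  ];
}

// Contents of the list file read by ffmpeg's concat demuxer.
function concatList(clipPaths) {
  return clipPaths.map((p) => `file '${p}'`).join("\n") + "\n";
}

// Arguments to join the clips named in listPath into one reel.
function concatArgs(listPath, output) {
  return ["-f", "concat", "-safe", "0", "-i", listPath, "-c", "copy", output];
}

// Build every command needed for one highlights reel.
function buildPlan(input, segments, output) {
  const clips = segments.map((_, i) => `clip${i}.mp4`);
  return {
    cuts: segments.map((s, i) => clipArgs(input, s, clips[i])),
    list: concatList(clips),
    concat: concatArgs("clips.txt", output),
  };
}
```

Each argument array could then be run with something like `child_process.execFileSync("ffmpeg", args)` after writing `list` to `clips.txt`. Stream copying is fast, but cuts land on keyframes, so re-encoding may be preferable when frame-accurate cuts matter.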
Challenges we ran into
The biggest challenge was making sense of three distinct channels of data: sentiment scores, video decomposition, and audio insights. Performance depended on parameters such as the statement block size and the sentiment threshold, both of which required tuning. Often, multiple sentiments came up in the same shot, and post-processing logic based on the video analysis was essential to string together a smooth scene. Cutting up the video after retrieving the timestamps from the server was also non-trivial.
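One way to picture the post-processing is to widen each highlighted interval to the boundaries of the shots it touches, then merge intervals that overlap or sit close together. The sketch below assumes that approach. The function names and the `maxGap` parameter are illustrative, and `shots` stands in for a sorted list of shot boundaries such as the video analysis might provide.

```javascript
// Sketch only: names and the maxGap parameter are assumptions.

// Widen a highlight so it starts and ends on shot boundaries.
// shots: sorted, non-overlapping [{ start, end }] in seconds.
function snapToShots(segment, shots) {
  let start = segment.start;
  let end = segment.end;
  for (const shot of shots) {
    // The shot containing the highlight's start sets the new start.
    if (shot.start <= segment.start && segment.start < shot.end) start = shot.start;
    // The shot containing the highlight's end sets the new end.
    if (shot.start < segment.end && segment.end <= shot.end) end = shot.end;
  }
  return { start, end };
}

// Merge segments that overlap or are separated by at most maxGap seconds.
function mergeSegments(segments, maxGap) {
  const sorted = [...segments].sort((a, b) => a.start - b.start);
  const merged = [];
  for (const seg of sorted) {
    const last = merged[merged.length - 1];
    if (last && seg.start - last.end <= maxGap) {
      last.end = Math.max(last.end, seg.end);
    } else {
      merged.push({ start: seg.start, end: seg.end });
    }
  }
  return merged;
}

// Snap every highlight to its shots, then merge the results into a reel.
function buildReel(highlights, shots, maxGap) {
  return mergeSegments(highlights.map((h) => snapToShots(h, shots)), maxGap);
}
```

With this scheme, several highlights that fall inside the same shot collapse into a single segment, so the reel never cuts in the middle of a shot.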
Accomplishments that we're proud of
Having only created mobile hacks prior to this, we are proud that we could pull off a fully functional project that used many tools unrelated to the mobile platform, such as scraping, PHP scripting, and server-side protocols.
What we learned
We became comfortable moving around the Google Cloud Platform and integrating the various APIs available in its suite. We also learned how to tune the parameters of our algorithm to get the most out of the sentiment analysis API's output.
What's next
We hope to gauge real-time social media response to identify the most popular moments at events, possibly by tracking the distribution of live comments and trending tweets, to create more insightful highlight videos.