We wanted to give content creators a tool that provides a more in-depth understanding of their community's reception of their videos. A simple like/dislike bar doesn't accurately model viewers' sentiment toward a video, and going through thousands of comments manually is incredibly inefficient. In addition, a like/dislike bar is heavily weighted towards the initial reactions to a video, whereas looking at the comments in a structured way can give you an idea of how people are reacting to it much later.

What it does

The program is a single-page application where a user provides a YouTube URL and a number of comments. The back end uses the YouTube API to pull the most recent comments (about 200), parses them, determines their sentiment using NLTK, and then returns data that would be useful to the user. We return an overall sentiment analysis, a measure of toxicity (the percentage of comments that are negative), a word cloud of the most popular words, and finally the most positive and most negative comments on the video. These tend to be highly exaggerated, and therefore humorous, comments, and we had many hours of fun just looking at videos.
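As a rough illustration of the statistics described above, here is a minimal sketch and not our exact code. It assumes that a comment counts as "negative" when its TextBlob polarity is below zero, and the names `summarize_comments`, `textblob_polarity`, and `polarity_fn` are made up for this example. The word counts are a plain frequency table of the kind a word cloud could be drawn from, with no stop-word filtering.

```python
from collections import Counter
import string


def textblob_polarity(text):
    """Return TextBlob's polarity score, a float in [-1.0, 1.0]."""
    # Imported lazily so the rest of the sketch works without TextBlob installed.
    from textblob import TextBlob
    return TextBlob(text).sentiment.polarity


def summarize_comments(comments, polarity_fn=textblob_polarity, top_words=10):
    """Compute overall sentiment, toxicity, extreme comments, and word counts.

    Assumption for this sketch: "toxicity" is the percentage of comments
    whose polarity is strictly negative.
    """
    if not comments:
        raise ValueError("need at least one comment")

    scored = [(polarity_fn(comment), comment) for comment in comments]
    polarities = [score for score, _ in scored]
    negative_count = sum(1 for score in polarities if score < 0)

    # Split on whitespace and trim ASCII punctuation from each token.
    words = Counter()
    for comment in comments:
        for token in comment.lower().split():
            token = token.strip(string.punctuation)
            if token:
                words[token] += 1

    return {
        "average_polarity": sum(polarities) / len(polarities),
        "toxicity_percent": 100.0 * negative_count / len(polarities),
        "most_positive": max(scored, key=lambda pair: pair[0])[1],
        "most_negative": min(scored, key=lambda pair: pair[0])[1],
        "top_words": words.most_common(top_words),
    }
```

Passing the scoring function in as `polarity_fn` keeps the arithmetic separate from the NLP library, so a different sentiment model could be swapped in without touching the rest.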

How we built it

We used the Python TextBlob natural language processing library for the sentiment analysis. We used the YouTube API for pulling comment threads on videos. The back end of the website is managed with Flask. The front end is an HTML web page powered by Bootstrap and Jinja templating.
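The comment-pulling step could look roughly like the sketch below. It is an assumption-laden illustration and not the package we actually used: it assumes a YouTube Data API v3 client of the kind built with `googleapiclient.discovery.build("youtube", "v3", developerKey=...)`, and the helper names `extract_video_id` and `fetch_recent_comments` are hypothetical.

```python
from urllib.parse import urlparse, parse_qs

# Hosts accepted by this sketch; other URL shapes are not handled.
_WATCH_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url):
    """Pull the video ID out of a watch URL or a youtu.be short link."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    if host in _WATCH_HOSTS:
        return parse_qs(parsed.query).get("v", [None])[0]
    return None


def fetch_recent_comments(youtube, video_id, limit=200):
    """Collect up to `limit` top-level comments, newest first.

    `youtube` is assumed to be a YouTube Data API v3 client object.
    The API returns at most 100 threads per page, so we follow
    nextPageToken until we have enough comments or run out of pages.
    """
    comments = []
    page_token = None
    while len(comments) < limit:
        params = {
            "part": "snippet",
            "videoId": video_id,
            "order": "time",
            "textFormat": "plainText",
            "maxResults": min(100, limit - len(comments)),
        }
        if page_token:
            params["pageToken"] = page_token
        response = youtube.commentThreads().list(**params).execute()

        for item in response.get("items", []):
            snippet = item["snippet"]["topLevelComment"]["snippet"]
            comments.append(snippet["textDisplay"])

        page_token = response.get("nextPageToken")
        if not page_token:
            break
    return comments[:limit]
```

A Flask route could call `extract_video_id` on the submitted URL, pass the result to `fetch_recent_comments`, and hand the list of comments to the sentiment analysis before rendering the template.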

Challenges we ran into

  • Learning how to use the YouTube API was difficult because the example code they gave was broken
  • Figuring out what a content creator was interested in
  • No one on our team was particularly strong with front-end development, so we struggled to get the CSS working in a way that looked good
  • Flask was very finicky and at one point segfaulted
  • Getting the front end to connect with the backend was tough
  • Getting the word cloud to display properly was incredibly annoying

Accomplishments that we're proud of

  • We are proud of getting all of our analytical tools to work and display data, particularly the word cloud
  • That we managed to make any product at all
  • That we found a package that would help us get the YouTube comments to work
  • We managed to make our own measure of toxicity
  • The top comments are hilariously bad
  • Emojis are, for some reason, supported and show up in the word cloud
  • Baby by Justin Bieber got a redemption arc
  • John Oliver has one angry comment section

What we learned

  • We learned a lot about the nuances of front-end development
  • Working with APIs
  • Natural language processing using TextBlob and NLTK
  • Hosting using Flask

What's next for youtube-comment-analysis

  • Return main ideas of comments (and comment threads)
  • Compare the comment statistics of two different videos
  • Use the timestamps of comments to visualize community engagement with the video over time
  • Display the most active commenters
  • Detect which comments are fake
