In a politically polarized world, progress stalls when opposing camps turn their discourse inward. We want to show visually which topics each political side is discussing, in order to advance the conversation.

What it does

It scrapes websites that are classified by political affiliation and returns the most prominent keywords, which are displayed as the main conversation topics of each side.
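To make the idea of "most prominent keywords per side" concrete, here is a minimal sketch that uses a plain frequency count. It is a simplification for illustration only: the project itself relied on topic modeling, and the function name `top_keywords` and the sample data are hypothetical.

```python
from collections import Counter

def top_keywords(articles_by_side, n=3):
    """Return the n most frequent tokens for each political affiliation."""
    result = {}
    for side, articles in articles_by_side.items():
        # Count every token across all articles of this affiliation.
        counts = Counter(token for tokens in articles for token in tokens)
        result[side] = [word for word, _ in counts.most_common(n)]
    return result

# Hypothetical sample: affiliation label -> list of already-cleaned token lists.
sample = {
    "left": [["healthcare", "tax", "climate"], ["climate", "healthcare", "climate"]],
    "right": [["tax", "border", "tax"], ["border", "tax", "trade"]],
}
keywords = top_keywords(sample, n=2)
print(keywords)
```

The resulting dictionary maps each side to its keyword list, which is the kind of structure a visualization layer could consume.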

How we built it

We used a library called Newspaper3k for crawling and scraping articles. Then, we used NLTK for stop-word removal and stemming. We used an LDA (Latent Dirichlet Allocation) algorithm to model the topics of each article. Finally, we used D3.js to create the visualization of the results.
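The cleaning step described above can be sketched roughly as follows. To keep the example dependency-free, it does not call NLTK: the small `STOP_WORDS` set and the `crude_stem` suffix stripper are stand-ins we assume here for NLTK's stop-word list and stemmer, and they are far less thorough than the real ones.

```python
import re

# Assumption: a tiny stop-word set standing in for NLTK's much larger English list.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "on",
              "is", "are", "was", "for", "with"}

# Assumption: crude suffix stripping, not a real stemming algorithm such as Porter.
SUFFIXES = ("ing", "ed", "es", "s")

def crude_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lowercase, tokenize on letters, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]

tokens = preprocess("The senators are debating taxes and the proposed budgets")
print(tokens)  # ['senator', 'debat', 'tax', 'propos', 'budget']
```

Stemming collapses variants such as "taxes" and "tax" into one token, so the topic model sees them as the same word.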

Challenges we ran into

The biggest challenge was the topic modeling. We used Latent Dirichlet Allocation to determine the topics discussed in each article, and the precision of the resulting topics can still be improved.
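For readers unfamiliar with LDA, the following is a minimal sketch of one common way to fit it, collapsed Gibbs sampling, written in plain Python. It is not the implementation used in the project (the write-up does not say which one was used); the function name `lda_gibbs`, the hyperparameter defaults, and the toy documents are all assumptions for illustration.

```python
import random

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on a list of token lists.

    Returns (doc_topic, topic_word, vocab); each row of doc_topic and
    topic_word is a smoothed probability distribution.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    ndk = [[0] * num_topics for _ in docs]       # tokens per (document, topic)
    nkw = [[0] * V for _ in range(num_topics)]   # tokens per (topic, word)
    nk = [0] * num_topics                        # tokens per topic
    z = []                                       # topic assigned to each token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        assignments = []
        for w in doc:
            k = rng.randrange(num_topics)
            assignments.append(k)
            ndk[d][k] += 1
            nkw[k][word_id[w]] += 1
            nk[k] += 1
        z.append(assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                ndk[d][k] -= 1
                nkw[k][wid] -= 1
                nk[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][wid] + beta) / (nk[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][wid] += 1
                nk[k] += 1

    doc_topic = [
        [(ndk[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(nkw[k][w] + beta) / (nk[k] + V * beta) for w in range(V)]
        for k in range(num_topics)
    ]
    return doc_topic, topic_word, vocab

# Hypothetical toy corpus of already-cleaned tokens.
docs = [["tax", "budget", "tax"], ["climate", "energy", "climate"],
        ["tax", "budget"], ["energy", "climate"]]
doc_topic, topic_word, vocab = lda_gibbs(docs, num_topics=2, iterations=50)
```

The number of topics, the priors `alpha` and `beta`, and the quality of the cleaning step all influence how coherent the topics look, which is one reason the precision is hard to get right.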

Accomplishments that we're proud of

It was the first time we scraped the internet for data, and we learned the basics of D3.js, one of the most popular JavaScript libraries for visualizing data.

What we learned

Political camps take interest in different subjects.

What's next for Political Polarity in the Media

Improving topic accuracy, amassing more data, and categorizing data by date to observe how political topics progress and trend over time.
