Encouraged by the MHacks theme of social good, our team set out to tackle the issue of biased news. To find bias, our program looks at the words and grammar of the text itself. The most dangerous aspect of biased news is its power to veer toward fake news or to provoke widespread, unwarranted outrage through the language the author chooses. We were inspired by a study conducted by Molly Crockett, a professor of psychology at Yale University, and decided to build on its ideas. In the study, Professor Crockett explains how, by using so-called "outrage words" such as "deplorable" and "reprehensible", articles and tweets alike generate massive amounts of user engagement by playing on chemical responses to arguments. Online, people are less careful with their words because no one is there in person to react to them. Our goal was to build a tool that helps people identify articles they can trust in today's highly opinionated times.
What it does
The website gives the user two options: enter a URL or enter text. If a URL is entered, the program scrapes the text from the page and runs sentiment analysis on it, which returns two values: polarity and subjectivity. Those values are then presented to the user as the polarity and subjectivity percentages of the chosen article. The text option works the same way, except that the text to be analyzed comes directly from the user.
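As a rough illustration of the last step, the sketch below turns the two raw scores into percentages. It assumes the ranges TextBlob documents for its default analyzer (polarity from -1.0 to 1.0, subjectivity from 0.0 to 1.0). The function name `to_percentages` and the choice to report polarity as a magnitude are assumptions made for this example, not necessarily how our site computes them.

```python
def to_percentages(polarity: float, subjectivity: float) -> tuple[float, float]:
    """Convert raw sentiment scores to percentages.

    Assumes polarity lies in [-1.0, 1.0] and subjectivity in [0.0, 1.0].
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must be between -1.0 and 1.0")
    if not 0.0 <= subjectivity <= 1.0:
        raise ValueError("subjectivity must be between 0.0 and 1.0")
    # Assumption: polarity is shown as a magnitude, so strongly negative and
    # strongly positive text both read as highly polarized.
    polarity_pct = round(abs(polarity) * 100, 1)
    subjectivity_pct = round(subjectivity * 100, 1)
    return polarity_pct, subjectivity_pct


if __name__ == "__main__":
    print(to_percentages(-0.5, 0.75))
```

Keeping the sign of the polarity instead would let the page distinguish negative from positive wording; the magnitude is used here only to keep the example small.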
How we built it
The website is hosted on the Google Cloud Platform for scalability and robustness. For the back end we decided to use Flask (Python) for its simplicity. The analysis tool relies mainly on two Python packages: Beautiful Soup and TextBlob. Beautiful Soup handles the scraping by parsing a page's HTML and sifting through headers and keywords to find and organize the article data. TextBlob is the package that provides the sentiment analysis tool.
Challenges we ran into
Web scraping can conflict with the copyright restrictions of certain websites, in which case the output is not guaranteed. In addition, many websites present other text alongside the target article on the same page. Extracting the correct text proved significantly harder on some websites than on others.
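One common way to reduce the noise from surrounding text is to drop elements that rarely contain article body text before collecting paragraphs. The sketch below shows that heuristic; the tag list and the function name `extract_article_text` are assumptions for illustration, and real sites often need site-specific rules on top of this.

```python
from bs4 import BeautifulSoup

# Assumption: these elements rarely hold the body text of an article.
NON_ARTICLE_TAGS = ["script", "style", "nav", "header", "footer", "aside", "form"]


def extract_article_text(html: str) -> str:
    """Return paragraph text, skipping navigation, sidebars, and footers."""
    soup = BeautifulSoup(html, "html.parser")
    # Detach the unwanted elements (and everything inside them) from the tree.
    for tag in soup.find_all(NON_ARTICLE_TAGS):
        tag.extract()
    # Prefer an explicit <article> element; otherwise search the whole page.
    root = soup.find("article")
    if root is None:
        root = soup
    paragraphs = (p.get_text(" ", strip=True) for p in root.find_all("p"))
    return "\n".join(text for text in paragraphs if text)
```

Pages that do not use semantic tags such as `<article>` fall back to every remaining paragraph, which is where extraction tends to be least reliable.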
Accomplishments that we're proud of
Having never built a website before, we are proud to have built one using Google Cloud Platform. We are also happy with how well the analysis tool fits in with the website. We also learned how to set up a domain name! Check it out here.
What we learned
What's next for Mhacks2019
We hope to add deeper analysis for each article in the future. In addition, we hope to turn Polaras into a hub for news browsing, where users can browse articles from various sources. Using our tool, each article would be rated based on its subjectivity.