Media Hero

Inspiration

This project was inspired by how hard it is to know what is really going on in the world when so much of the news around us perpetuates propaganda. Many outlets spin whatever narrative will bring them the most views, regardless of how much damage it causes to society. We noticed that the most harmful, and often the most fake, news sources offered few facts or logical arguments and instead capitalized on the public's emotions. They relied on extremist language to convince the public that they alone were right.

What it does

Media Hero exists to remove the personal bias and emotional factors we might bring when sifting through news. The program calculates how often a news source uses political buzzwords or emotionally charged extremist language and lowers the credibility of that source accordingly. We referenced many studies explaining that media containing certain buzzwords (the ones in our database) is statistically much more likely to be fake. Media Hero asks the user to input one known Democratic news source and one known Republican news source, then runs the buzzword credibility check on both. After that, it compares the content of the two to see how consistent they are with each other (if they are not similar, at least one of them is probably misleading us). That way the user has an idea of where the endpoints lie on a scale of bias-truth-bias. Next, the program asks the user to input any other news sources they want to check in comparison. Media Hero evaluates whether each of those sources has more content similarity with the more credible baseline or with the less credible one (as established in round one, the buzzword test). This gives the user an idea of how reliable their own sources are, relatively speaking.
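The round-one buzzword test can be sketched as follows. This is a minimal illustration, not the project's actual code: the buzzword list, the `penalty` weight, and the function names are all hypothetical, and the real database is not reproduced here.

```python
import re

# Hypothetical buzzword list for illustration only.
BUZZWORDS = {"radical", "disaster", "traitor", "outrageous"}


def buzzword_rate(text, buzzwords=BUZZWORDS):
    """Return the fraction of words in `text` that appear in `buzzwords`."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in buzzwords)
    return hits / len(words)


def credibility(text, penalty=5.0):
    """Start at 1.0 and subtract a penalty proportional to the buzzword rate.

    The penalty weight is an assumed value, not one taken from the project.
    """
    return max(0.0, 1.0 - penalty * buzzword_rate(text))


calm = "The council approved the budget on Tuesday after a short debate."
heated = "This outrageous budget is a disaster pushed by radical politicians."

print(credibility(calm), credibility(heated))
```

Dividing by the total word count keeps a long article from being penalized simply for being long.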

How we built it

We used spaCy, a natural language processing library, to calculate text content similarity, and Beautiful Soup as a web scraping tool. With Beautiful Soup, we pulled the text off news pages and wrote code to check the articles for buzzwords, which is the first part of the credibility test. For the second part, we ran the scraped texts through spaCy to check them against each other for content similarity. We wrote the code in Google Colab, where we imported both libraries. The idea was to first take two baseline news sources that we knew were clearly biased towards one political party and then find the truth in the middle.
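The two steps described here, extracting article text and scoring similarity, can be sketched as below. To keep the example runnable without extra installs, it swaps in the standard library's `html.parser` for Beautiful Soup and a simple word-count cosine similarity for spaCy's similarity score; both are stand-ins, and every class and function name is hypothetical.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collect the text found inside <p> tags and ignore everything else."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "p" and self.depth > 0:
            self.chunks.append(" ")  # keep neighbouring paragraphs apart
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def article_text(html):
    """Return the paragraph text of an HTML page with whitespace normalized."""
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


def similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    a = Counter(re.findall(r"[a-z']+", text_a.lower()))
    b = Counter(re.findall(r"[a-z']+", text_b.lower()))
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


page = (
    "<html><body><h1>Headline</h1>"
    "<p>The council approved the <b>budget</b>.</p>"
    "<script>track()</script></body></html>"
)
print(article_text(page))
```

With the libraries the project actually used, the rough equivalents would be collecting `get_text()` from `soup.find_all("p")` in Beautiful Soup and calling `similarity` on two spaCy `Doc` objects. Word-count overlap only captures shared vocabulary, so it is a much cruder measure than spaCy's vector-based score.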

Challenges we ran into

One of the challenges we ran into was the realization that we cannot count on both baseline sources being equally biased towards their own political parties. It could be entirely possible, for example, that the Democratic source is telling the truth and the Republican source is completely lying. If we assumed the truth was in the middle and chose to read a source whose buzzword count falls halfway between the two baselines, we would still end up with a biased source. Considering this, we adapted our plan and added a secondary content similarity test. Even if a source under evaluation uses few buzzwords, it is probably not the best read if its content is more similar to the less credible baseline.
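One way to combine the two tests is sketched below. It assumes the buzzword credibility scores and similarity scores have already been computed; the function names, verdict strings, and the `min_credibility` threshold are hypothetical and not taken from the project.

```python
def rank_baselines(scores):
    """Given {source_name: credibility}, return (more_credible, less_credible)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered[0], ordered[-1]


def assess(source_credibility, sim_to_more, sim_to_less, min_credibility=0.7):
    """Apply the similarity test first, then the buzzword test.

    A source that resembles the less credible baseline is flagged even if
    it uses few buzzwords. The 0.7 threshold is an assumed value.
    """
    if sim_to_more <= sim_to_less:
        return "closer to the less credible baseline"
    if source_credibility < min_credibility:
        return "heavy buzzword use"
    return "relatively reliable"


more, less = rank_baselines({"source_a": 0.9, "source_b": 0.6})
print(more, less)
print(assess(source_credibility=0.95, sim_to_more=0.4, sim_to_less=0.8))
```

In the last call the source has a high buzzword score but is still flagged, which is the case the secondary test was added to catch.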

Accomplishments that we're proud of

We are proud that we were able to come up with something we consider a finished product in our very first hackathon. After we finished, we also searched online to see whether anyone had tried a similar idea and found no one who had taken this approach. While that could mean the approach is inefficient or incorrect, the code worked as we expected in our trial runs, which makes us feel extremely proud. We are also extremely proud of what we were able to learn from this experience.

What we learned

Neither of us knew how HTML works, so we learned the basics in order to extract text from webpages. We also learned how to use each other's strengths to our advantage. Naman is very good at coming up with logical solutions and algorithms, while I, Navya, am better at implementing code. Despite our strengths, we felt that we were falling short in certain areas (understanding web pages and HTML), so we both learned how to research effectively. That way we were able to learn completely new concepts and work them into our project in time. We also learned not to panic and to keep cool when things go wrong. Approximately an hour ago, our code crashed and would not accept any website URLs. Though we were frightened initially, we realized that anxiety wouldn't fix our project, so we split up the code, ran it through an online Python stepper website, and successfully figured out where the fatal error was.

What's next for Media Hero

In the future, we hope to gain experience with artificial intelligence and machine learning so we can add these aspects to Media Hero. Right now, this program has a very small database of buzzwords and confirmed politically biased media sources. We plan to train an AI to recognize more emotionally manipulative words and expand the "buzzwords list" to phrases and even writing styles. We also plan to upgrade the similarity checker so it automatically checks similarity to a huge variety of resources and takes the average as a credibility score for the endpoints.

Built With

beautiful-soup
google-colab
pycharm
python
requests
spacy

Updates

Navya Mishra started this project — Jan 16, 2022 03:58 AM EST
