Fake news has been a major problem for the last couple of years and is a frequent subject of debate. Its consequences are serious, and it only gets easier to spread fake news into the world, so the problem needs to be tackled. At first, fake news seemed relatively harmless, but over time it has become clear that even political elections are influenced by it. The time has come to stand up to fake news.
Before we can stand up to fake news, we first need to define what news is. News is the representation of events. It always consists of two parts: the facts (who, what, where) and the interpretation (how, why and judgement). The facts can be measured and checked. The interpretation is always up for debate, since it is subjective.
Besides defining what news is, we need to identify the parties that make news. In general, there are two: respected news sources and unknown news sources. The key difference between them is that the journalists of respected news sources are educated and have been taught integrity and ethics.
Trends causing fake news
Now that we know what news is and who makes it, let’s try to understand why fake news is a thing. From our perspective this is due to three trends: 1) anyone can write anything and find an audience; 2) newsworthiness is based on juiciness rather than quality of content; 3) the importance of news is defined by the number of readers, not the number of sources.
This combination creates ideal breeding conditions for fake news. But fear not, we have the one solution to the fake news problem:
To prevent fake news from dominating social media and the news, we use the principle of a hallmark: a quality mark that shows the reader whether a news article can be trusted. One of three icons tells the reader at first glance whether to trust the article. ✓ indicates that the article can be trusted, X indicates that it cannot be trusted, and ? means that there is not enough data available to judge its trustworthiness.
To make this possible, we rely on the wisdom of the crowd: the idea that a group knows more than an individual. Here, the respected news sources are the crowd that together holds the right information. So to test the trustworthiness of an article, we measure the similarity between this article and trusted articles on the same subject.
The Google Cloud Natural Language API is used to identify the key entities in news articles. These entities can be divided into who (person & organization), where (location) and what (event & other). The who, what and where entities of articles from trusted sources are stored. When the validity of an article needs to be established, its who, what and where entities are compared with the entities of the trusted news articles on the same topic. The more the entities overlap, the more similar the content of the article is to the trusted articles. Because we assume that the content of the trusted articles is valid, and we have a measure for the similarity between texts, we can classify articles as trusted (similar to the trusted articles on the same subject) or not trusted (not similar to the trusted articles on the same subject).
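The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not our actual implementation: it assumes the entities have already been extracted and are given as (name, type) pairs using the type names PERSON, ORGANIZATION, LOCATION, EVENT and OTHER. The Jaccard overlap as similarity measure, the `threshold` of 0.5, the `min_trusted` count of 2 and all function names are assumptions chosen for the example.

```python
from typing import Dict, Iterable, List, Set, Tuple

Entity = Tuple[str, str]  # (entity name, entity type name), e.g. ("Berlin", "LOCATION")

# Entity type names grouped into the who / where / what buckets from the text.
BUCKET_OF_TYPE: Dict[str, str] = {
    "PERSON": "who",
    "ORGANIZATION": "who",
    "LOCATION": "where",
    "EVENT": "what",
    "OTHER": "what",
}
BUCKETS = ("who", "where", "what")


def bucket_entities(entities: Iterable[Entity]) -> Dict[str, Set[str]]:
    """Group normalized entity names by bucket; other entity types are ignored."""
    grouped: Dict[str, Set[str]] = {bucket: set() for bucket in BUCKETS}
    for name, type_name in entities:
        bucket = BUCKET_OF_TYPE.get(type_name)
        if bucket is not None:
            grouped[bucket].add(name.strip().lower())
    return grouped


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of names that two sets have in common; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def article_similarity(article: Iterable[Entity], trusted: Iterable[Entity]) -> float:
    """Mean Jaccard overlap across the who, where and what buckets."""
    left, right = bucket_entities(article), bucket_entities(trusted)
    return sum(jaccard(left[b], right[b]) for b in BUCKETS) / len(BUCKETS)


def hallmark(article: List[Entity], trusted_articles: List[List[Entity]],
             threshold: float = 0.5, min_trusted: int = 2) -> str:
    """Return "✓", "X" or "?" for an article, given trusted articles on the same topic."""
    if len(trusted_articles) < min_trusted:
        return "?"  # not enough trusted data to judge (assumed rule)
    scores = [article_similarity(article, t) for t in trusted_articles]
    mean_score = sum(scores) / len(scores)
    return "✓" if mean_score >= threshold else "X"
```

Calling `hallmark(article, trusted_articles)` with the entity lists for one article and for the trusted articles on the same topic returns one of the three icons. Averaging over all trusted articles is one possible choice; comparing against the union of their entities would be another.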
Challenges we ran into
Even though natural language processing has advanced a lot recently, it is still hard to determine the subject of an article automatically. This forced us to add the subject manually.
Accomplishments that we're proud of
Over the last 48 hours, our goal was to make something socially relevant, and this is the accomplishment we are most proud of. We worked on a problem that is causing trouble in the world right now. The problem of fake news is getting bigger every day, so the need for a solution is growing too. The fact that we actually came up with a solution makes us incredibly proud. This feeling is strengthened by the fact that our solution uses a technological innovation that is still very new to the world to solve a real social problem: artificial intelligence is still in an early phase of exploration, and we have already applied it in this project.
What we learned
During Junction we had lots of fun and learned many useful things. For example, how to stop recording your screen (I know, we should have known this…) and how important sleep can be. Beyond these small things, we also discovered that it is very important to have a good concept before you start. In our case, we worked and brainstormed for 12 hours before eventually arriving at our current concept.
What's next for Quality Control
Quality Control is currently based on a simple algorithm, which should be extended and improved in the future. We could, for example, use a multiple regression model to improve the similarity rating of the who, what and where entities. Quality Control does not yet consider when and where an article was written. It should, to make sure the system cannot be tricked by bots (bots could automatically write many articles, which would result in a higher rating). Once the algorithm is sufficiently reliable, the next step is to have the big social media companies implement it on their websites.
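As a rough idea of where a regression could plug in, the sketch below combines per-bucket similarity scores with one weight per bucket. A multiple regression would fit such weights from labelled articles. Here the weights are hand-picked placeholders, and the function name and the example numbers are hypothetical.

```python
from typing import Dict


def weighted_similarity(per_bucket: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted mean of per-bucket similarity scores (each expected in [0, 1])."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Buckets missing from per_bucket count as zero similarity.
    return sum(w * per_bucket.get(bucket, 0.0) for bucket, w in weights.items()) / total


# Placeholder weights: "who" counts twice as much as "what" and "where".
example_weights = {"who": 2.0, "what": 1.0, "where": 1.0}
example_scores = {"who": 1.0, "what": 0.5, "where": 0.0}
score = weighted_similarity(example_scores, example_weights)
```

With equal weights this reduces to a plain average of the three scores, so the weights only matter once some entity types turn out to be better indicators of trustworthiness than others.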