Inspiration

Fake news is a hot topic in the news and has become a growing problem as the use of social media has increased. With the recent US election drama as well as the pandemic, we have seen several fake news articles circulate on social media. Current fake news detection methods are ineffective, complex, and not scalable, so we decided to create one based on machine learning.

What it does

We built a web service where users can enter the title or the text of an article and get back a prediction of how likely the article is to be fake.
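A minimal sketch of what such a service could look like, assuming Flask (the framework named under "How we built it"). The `/predict` route, the `text` field, the 0.5 cut-off, and the `predict_fake_probability` callable are all hypothetical names chosen for illustration, not the project's actual interface.

```python
def format_prediction(probability: float) -> dict:
    """Turn a model score in [0, 1] into the JSON payload shown to the user."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return {
        "fake_probability": round(probability, 3),
        # The 0.5 threshold is an assumption made for this sketch.
        "label": "likely fake" if probability >= 0.5 else "likely real",
    }


def create_app(predict_fake_probability):
    """Build a Flask app around a scoring function (hypothetical: str -> float)."""
    # Imported here so format_prediction can be used without Flask installed.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        payload = request.get_json(silent=True)
        if not isinstance(payload, dict):
            payload = {}
        text = str(payload.get("text", "")).strip()
        if not text:
            return jsonify({"error": "field 'text' is required"}), 400
        return jsonify(format_prediction(predict_fake_probability(text)))

    return app
```

Calling `create_app(my_model_fn).run()` would serve the endpoint locally, where `my_model_fn` stands in for whatever function wraps the trained model.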

How we built it

We used spaCy to perform the natural language processing and TensorFlow to implement a two-layer LSTM neural network, using the following datasets:

Both the back end and the front end were built using Flask.
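For readers unfamiliar with stacked LSTMs, here is a rough sketch of a two-layer LSTM classifier in TensorFlow's Keras API. It is not the exact model we trained: the layer width (`units`), the sequence length, the reading of each word vector as 300 values, the optimizer, and the loss are all assumptions made for illustration.

```python
MAX_WORDS = 1000   # assumed number of words kept per article
VECTOR_DIM = 300   # assumed number of values in each word vector


def build_model(max_words: int = MAX_WORDS, vector_dim: int = VECTOR_DIM, units: int = 64):
    """Two stacked LSTM layers followed by a single sigmoid 'fake' probability."""
    import tensorflow as tf  # imported lazily; requires TensorFlow 2.x

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(max_words, vector_dim)),
        # return_sequences=True hands the full sequence to the second LSTM.
        tf.keras.layers.LSTM(units, return_sequences=True),
        tf.keras.layers.LSTM(units),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

The model takes one pre-vectorized article at a time as a `(max_words, vector_dim)` matrix and outputs a number between 0 and 1, which can be read as the predicted probability that the article is fake.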

Challenges we ran into

We ran into a lot of memory issues while training. To feed the articles into the neural network, we had to encode every article as a matrix of the same size. The longest article had over 9,000 words, each word was encoded as a 300-byte vector, and there were 40,000 articles, which corresponds to more than 100 GB.
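The estimate can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch using the figures quoted above; the function name is made up for illustration.

```python
def dense_dataset_bytes(num_articles: int, words_per_article: int, bytes_per_word: int) -> int:
    """Size of a dense array in which every article is padded to the same length."""
    return num_articles * words_per_article * bytes_per_word


# 40,000 articles, each padded to 9,000 words, 300 bytes per word
full_size = dense_dataset_bytes(40_000, 9_000, 300)
print(f"{full_size / 1e9:.0f} GB")  # prints "108 GB"
```

If each word vector were instead stored as 300 32-bit floats, the total would be four times larger.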

To solve this, we noticed that 95% of the articles were under 1,000 words long, so we truncated each article to at most 1,000 words.
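The truncation step can be sketched roughly as follows. Keeping the first 1,000 words and zero-padding shorter articles are assumptions made for this sketch, as are the function and parameter names; the write-up only states that articles were cut to 1,000 words.

```python
from typing import List, Sequence

MAX_WORDS = 1000  # cut-off described above


def pad_or_truncate(
    vectors: Sequence[Sequence[float]],
    max_words: int = MAX_WORDS,
    dim: int = 300,
) -> List[List[float]]:
    """Return exactly max_words word vectors: clip long articles, zero-pad short ones."""
    clipped = [list(v) for v in vectors[:max_words]]  # keep the first max_words words
    padding = [[0.0] * dim for _ in range(max_words - len(clipped))]
    return clipped + padding
```

With the same figures as before (40,000 articles, 300 bytes per word), fixing the length at 1,000 words brings the dense dataset down from roughly 108 GB to roughly 12 GB.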

We also wanted to host it on Heroku, but TensorFlow and the NLP libraries we used to vectorize the words took up too much space, so we were only able to host it locally.

Accomplishments that we're proud of

This was the first time we took on a serious natural language processing (NLP) challenge, and we were surprised by how well it worked!

What we learned

We learned a lot about NLP pre-processing and how to deal with memory issues. This was a lot of fun and gave us a great introduction to the world of natural language processing.

What's next for Detecting Fake News

We would like to compress the model so that it can be packaged as a Chrome extension, making the tool more accessible. We would also like to train it on more recent data about the pandemic, as well as on news articles from other countries.
