CleanTheBurgh - Detecting fake news in English and Italian

Inspiration

Inspired by JP Morgan's challenge and by Botometer, which detects Twitter bots, we created a similar tool that detects fake news in both English and Italian.

What it does

When a user enters a link to an online news article, the programme outputs the probability that the article is real.

How I built it

We first collected and parsed a large amount of data, including the titles, content and URLs of fake and real news articles.
The data was then fed to our machine learning model for training. In the end we chose and tuned a Random Forest model for prediction.
We also recorded websites that regularly post fake news in English and Italian. This list is used as a cross-check.
For users, we created a web scraper that returns the news content when a link is provided.
Lastly, we built a GUI for the tool.
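
As an illustration only, the cross-check and scraping steps above could look roughly like the Python sketch below. It uses just the standard library. The domain names, the function and class names, and the choice to read article text from `<p>` tags are all assumptions made for this example, not details taken from the project's code.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical entries; the project's real list of fake news sites is not shown here.
FLAGGED_DOMAINS = {"fake-news.example", "notizie-false.example"}


def domain_of(url: str) -> str:
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_flagged(url: str, flagged=FLAGGED_DOMAINS) -> bool:
    """True if the host is a flagged domain or a subdomain of one."""
    host = domain_of(url)
    return any(host == d or host.endswith("." + d) for d in flagged)


class ParagraphExtractor(HTMLParser):
    """Collect the text found inside <p> elements (an assumed heuristic)."""

    def __init__(self):
        super().__init__()
        self._depth = 0  # number of <p> tags currently open
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "p" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits inside a paragraph.
        if self._depth > 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html: str) -> str:
    """Return the paragraph text of an HTML page as one string."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

In a pipeline like the one described, `extract_text` would be applied to the downloaded page before the text is passed to the classifier, and `is_flagged` would be checked against the submitted link alongside the model's prediction.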

Challenges I ran into

Insufficient data for deep learning - used Random Forest, a supervised machine learning model, instead
Some bugs with the GUI not displaying an image - fixed

Accomplishments that I'm proud of

The tool works for both English and Italian.
The tool works correctly - accuracy on the test data reaches up to 80%.

What I learned

We gained hands-on experience with data collection, machine learning, web scraping and GUI development.

What's next for CleanTheBurgh

Detecting fake pictures, or pictures with wrong titles/captions.
Ideally, the news database and the list of fake news domains should be updated regularly to maintain accuracy.

Built With

apis
data-mining
gui
interface
machine-learning
natural-language-processing
python

Submitted to

Hack the Burgh 2018
- Winner JP Morgan

Created by

Machine learning model
file parser

Pinzhen (Patrick) Chen
AI&CS Undergraduate, ML/NLP/Hardware/Robotics, Hackathon enthusiast.
Efe Sinan Hoplamaz
Kacper Kielak
Leonardo Castorina

Updates

Pinzhen (Patrick) Chen started this project — Mar 11, 2018 07:06 AM EDT
