CleanTheBurgh - Detecting fake news in English and Italian
Inspired by JP Morgan's challenge and by Botometer, which detects Twitter bots, we created a similar tool that detects fake news in both English and Italian.
What it does
When a user enters a link to an online news article, the programme outputs its estimated probability that the article is real.
How I built it
- We first collected and parsed a large amount of data, including the titles, content and URLs of both fake and real news.
- We then used this data to train our machine learning model. In the end we chose and tuned a Random Forest model for prediction.
- We also recorded websites that regularly post fake news in English and Italian. This list is used as a cross-check.
- For users, we created a web scraper which returns the news content when a link is provided.
- Lastly, we have a GUI for the tool.
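The scraping and cross-check steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the `FLAGGED_DOMAINS` entries, the function names and the choice to keep only `<title>` and `<p>` text are all assumptions. Downloading the page is left out; in practice the HTML string would come from an HTTP request.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical list of flagged domains; the project's real list is not
# given in this write-up.
FLAGGED_DOMAINS = {"fake-news.example", "notizie-false.example"}


def domain_of(url: str) -> str:
    """Return the lower-cased host of a URL without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_flagged(url: str) -> bool:
    """True if the host is a flagged domain or a subdomain of one."""
    host = domain_of(url)
    return any(host == d or host.endswith("." + d) for d in FLAGGED_DOMAINS)


class ParagraphExtractor(HTMLParser):
    """Collects the text found inside <title> and <p> elements."""

    def __init__(self):
        super().__init__()
        self._current = None  # tag currently being captured, if any
        self.title = ""
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._current = tag
        elif tag == "p":
            self._current = tag
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        # Closing an inline tag such as </a> inside a paragraph is ignored.
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "p":
            self.paragraphs[-1] += data


def extract_article(html: str) -> dict:
    """Return the title and paragraph text of an HTML page."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    body = " ".join(p.strip() for p in parser.paragraphs if p.strip())
    return {"title": parser.title.strip(), "content": body}


sample_html = (
    "<html><head><title>Sample headline</title></head><body>"
    "<p>First <a href='#'>linked</a> paragraph.</p>"
    "<div>ignored</div><p>Second paragraph.</p></body></html>"
)
article = extract_article(sample_html)
print(article["title"], "|", article["content"])
print(is_flagged("https://www.Fake-News.example/story?id=1"))
```

Matching on the host rather than on the full URL means a flagged site is caught regardless of which article is linked, and the subdomain check avoids flagging unrelated hosts that merely end with the same letters.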
Challenges I ran into
- Insufficient data for deep learning - we used a Random Forest, a supervised machine learning model, instead
- Some bugs with the GUI not displaying images - fixed
Accomplishments that I'm proud of
- The tool works for both English and Italian.
- The tool works correctly - its accuracy on test data reaches up to 80%
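For readers unfamiliar with how a test-set accuracy like this is obtained, here is a minimal sketch, assuming scikit-learn and TF-IDF text features. Neither the library nor the feature extraction is confirmed by this write-up, and the headlines below are made up purely for illustration, so the numbers it prints say nothing about the real tool.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Toy, invented headlines for illustration only; 1 = real, 0 = fake.
texts = [
    "Council approves budget for road repairs",
    "Central bank keeps interest rates unchanged",
    "University opens new research library",
    "Local team wins regional football final",
    "Miracle fruit cures every disease overnight",
    "Aliens secretly control the city council",
    "Drinking bleach makes you immune to flu",
    "Moon landing filmed in a local garage",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Hold out a quarter of the data so accuracy is measured on unseen articles.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

# TF-IDF turns each text into a numeric vector the forest can train on.
model = make_pipeline(
    TfidfVectorizer(),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))

# predict_proba columns follow model.classes_, so look up the "real" column.
real_column = list(model.classes_).index(1)
prob_real = model.predict_proba(["Council approves new bus timetable"])[0][real_column]
print(f"test accuracy: {accuracy:.2f}, P(real): {prob_real:.2f}")
```

The probability returned by `predict_proba` is the kind of value the tool could show to the user, while the held-out accuracy is what a figure such as 80% would refer to.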
What I learned
We worked through data collection, machine learning, web scraping and building a GUI.
What's next for CleanTheBurgh
- Detecting fake pictures, or pictures with wrong titles/captions
- Ideally, the news database and the list of fake news domains should be updated regularly to maintain accuracy.