Fake news is false or misleading information presented as if it were true. It can be used to trick readers into believing incorrect information, which is harmful and dangerous. With the advancement of pretrained language models such as ChatGPT, generating fake news is now easier than ever. As such, it is important to be able to distinguish fake news from true news.

What it does

Using NLP techniques, we identified patterns that help us distinguish fake news from true news. These techniques range from statistical methods to machine learning methods. We found that fake news often has different features from true news, even though the two mostly cover the same subjects.

How we built it

We have used the following NLP techniques for our analysis: TF-IDF; topic modelling; clustering; named entity recognition; feature extraction using sentence embeddings and Bag of Words; machine learning methods such as Random Forest; and word clouds.
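To make the first technique concrete, here is a minimal sketch of TF-IDF weighting using only the Python standard library. It is an illustration under stated assumptions, not the project's actual code: the function names (tokenize, tf_idf, top_terms), the simple whitespace tokenizer, the unsmoothed log(N / df) variant of IDF, and the toy headlines are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (word.strip(".,!?;:\"'()") for word in text.lower().split())
    return [t for t in tokens if t]


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = term count / number of tokens in the document
    idf = log(N / df), where df is the number of documents containing the term
    (an unsmoothed variant; libraries often use smoothed formulas instead)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: count each term at most once per document.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def top_terms(weights, k=3):
    """Return the k highest-weighted terms of a single document."""
    return sorted(weights, key=weights.get, reverse=True)[:k]


# Made-up toy headlines for illustration; not taken from the project's dataset.
headlines = [
    "Senate passes budget bill after long debate",
    "SHOCKING secret cure doctors hate",
    "Budget debate continues in the Senate",
]
scores = tf_idf(headlines)
```

Terms that appear in fewer documents receive a higher weight, so in this toy input "passes" outweighs "senate" in the first headline. Per-document weights of this kind can serve as input features for a classifier such as Random Forest.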

Challenges we ran into

Our main challenges were limited resources, namely computational power and time. Many state-of-the-art NLP models, such as BERT, require GPUs for training and fine-tuning. Additionally, training and inference with these models take considerably more time than statistical methods. For example, we experimented with getting sentence embeddings from BERT; however, each sentence embedding took around 10 seconds on a GPU.

Accomplishments that we're proud of

We extracted interesting insights into the main differences between fake and true news.

What we learned

Please see the slides.

What's next for Fake News Detector

Please see the slides.

The slides are available at the last Google Drive link.
