[GIF: Web Application Demo]
[GIF: Web Extension Demo]

Project Cactus

A cross-platform AI Fake News Detector.

Web Application · GitHub Repository · Report Bugs · Request Features

Inspiration 💡

The trend of taking ivermectin despite its dubious purported health benefits has perplexed and enraged health experts and doctors struggling to control Covid-19. Self-medicating with ivermectin has been widely reported in the U.S., despite medical professionals advising against it. However, problems generally arise when people consume the version of ivermectin that is only fit for animals.

Unfortunately, this fake news made its way to Singapore, where Vanessa Koh Wan-Ling, her mother's daughter, fell victim to false information, leading to her comatose state.

Vanessa's story shook us to the core, as it showed us that not only was fake news more prevalent than we thought, but its consequences were also deadly serious. Furthermore, we were taken aback by how fake news can reinforce people's existing perceptions and ideas, creating an echo chamber that feeds a self-reinforcing feedback loop. In addition, we were shocked at how fake news can spread like wildfire 🔥.

As a result, our team was inspired to combine our software and frontend skills with Artificial Intelligence to combat the spread of fake news. We wanted the community to have a second opinion on whether something is fake news before they digest, share, and spread it.

What it does 💪

Project Cactus is a cross-platform web app and extension that allows the community to verify their news with the help of a machine learning model. Moreover, both avenues allow users to flag fake news as well!

The web app allows users to copy and paste the headline of a news article suspected to be fake news. The user then receives feedback from the model on how confident it is that the article may contain fake news. Our users can then use the app's sharing functionality to share the model's predictions with their friends and family on platforms like WhatsApp.

On the other hand, the web extension works automatically on social media platforms such as Twitter. It analyzes posts and articles in the user's feed and provides feedback on any that contain potentially misleading information. If the model's confidence that a headline is fake news exceeds a certain threshold, users receive feedback in red advising them to proceed carefully. Otherwise, the feedback is in green, signalling that the article looks safe. This encourages the community to be wary of articles before reading or sharing them.
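The red/green decision described above can be sketched in a few lines. This is a minimal illustration, not the extension's actual code: the extension itself is written in JavaScript, and the threshold value `FAKE_THRESHOLD`, the `Feedback` type, and the function name `feedback_for` are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical cut-off; the threshold actually used by the extension is not stated.
FAKE_THRESHOLD = 0.5


@dataclass
class Feedback:
    colour: str   # "red" or "green"
    message: str  # short text shown next to the post


def feedback_for(fake_probability: float, threshold: float = FAKE_THRESHOLD) -> Feedback:
    """Map the model's confidence that a headline is fake to user-facing feedback."""
    if not 0.0 <= fake_probability <= 1.0:
        raise ValueError("fake_probability must be between 0 and 1")
    if fake_probability > threshold:
        return Feedback("red", "Proceed carefully: this may be fake news.")
    return Feedback("green", "This article looks safe.")
```

A score exactly at the threshold is treated as safe here; whether the boundary case should be red or green is a design choice the original does not specify.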

In addition, suppose you came across a provocative news article that comes from a source you're not familiar with. Or perhaps your parents have just shared a suspicious news article with you via WhatsApp. Before you share the article, you could choose to send it to our model, which could tell you if the article is potentially fake news. This will let you make a more informed decision about whether or not to share the article, and you could even send the results of the query to your family, warning them of the misleading article.

And what if you came across a piece of fake news that wasn't flagged? You can simply use our reporting system to report the article. This will help in future updates to the model, allowing it to stay abreast of fake news trends.

How we built it 🏗️

Project Cactus was built from the ground up by a group of technology enthusiasts. Each group member was responsible for a certain task to ensure fast development times.

This is a brief overview of how Project Cactus was built:

Data Curation 📚

The model is trained on open-source datasets with over 62k data points (and more!). Refer to our Reference page for links to the fake news datasets.

The Machine Learning Model 🤖

Powering our app is a Bidirectional Long Short-Term Memory (LSTM) network, built in Keras. It has been trained using over 62k news articles and makes use of pre-trained GloVe word embeddings for word representation.

We also did additional pre-processing on the data, such as removing stop words from input sentences, before tokenizing our inputs.
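The pre-processing step can be illustrated with a small, dependency-free sketch. It is not our actual pipeline: the stop-word list is a tiny illustrative subset, and the helper names `clean`, `build_vocab`, and `encode`, as well as the padding and unknown-word ids, are assumptions chosen for the example.

```python
import re
from typing import Dict, List

# Tiny illustrative stop-word list; a real list would be much longer.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and", "for"}


def clean(headline: str) -> List[str]:
    """Lower-case the headline, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9']+", headline.lower())
    return [w for w in words if w not in STOP_WORDS]


def build_vocab(headlines: List[str]) -> Dict[str, int]:
    """Give each distinct word an integer id (0 = padding, 1 = unknown word)."""
    vocab: Dict[str, int] = {}
    for headline in headlines:
        for word in clean(headline):
            vocab.setdefault(word, len(vocab) + 2)
    return vocab


def encode(headline: str, vocab: Dict[str, int], max_len: int = 8) -> List[int]:
    """Turn a headline into a fixed-length list of word ids, padded with 0."""
    ids = [vocab.get(word, 1) for word in clean(headline)][:max_len]
    return ids + [0] * (max_len - len(ids))
```

Sequences of ids like these are what an embedding layer would consume; words missing from the vocabulary fall back to the unknown id rather than being dropped.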

The model was trained using Google Colab. The final model was evaluated on 15k news articles (7.5k for validation during training, 7.5k for the independent test set) and obtained 99% accuracy on the independent test set.

Model Deployment & Backend API 💻

The fully trained model is deployed using Google Cloud AI Platform, running in prediction mode and accelerated with a single NVIDIA Tesla T4 GPU.

An additional Node.js backend API is hosted on Heroku to provide accessible API endpoints for web clients.

Web Extension 🧩

The web extension uses JavaScript. It works by injecting a script into social media pages before they load, which then analyzes the posts. The extension does not store any user inputs or posts in a database unless users manually report them.

Web App 🕸️

The web app, hosted on Firebase, is powered by Vue.js.

Challenges we ran into 🧱

Saving the model turned out to be rather difficult, as the use of pre-trained word embeddings and a custom text pre-processing function made it hard to save the model directly. We discovered, after hours of training, that we had to serialize our pre-processing function before we could successfully load our saved model. In addition, the size of our model meant that our compute instance had a tendency to crash due to a lack of memory.
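One general way to avoid this kind of problem is to store the pre-processing state as plain data next to the model file, so it can be rebuilt at load time. The sketch below shows that alternative and is not necessarily what we did; the function names `save_preprocessing` and `load_preprocessing` and the JSON layout are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Dict, Set


def save_preprocessing(path: str, stop_words: Set[str], vocab: Dict[str, int], max_len: int) -> None:
    """Write everything the pre-processing step needs to a JSON file."""
    config = {
        "stop_words": sorted(stop_words),  # sets are not JSON-serializable
        "vocab": vocab,
        "max_len": max_len,
    }
    Path(path).write_text(json.dumps(config), encoding="utf-8")


def load_preprocessing(path: str) -> dict:
    """Read the JSON file back and restore the stop words as a set."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    config["stop_words"] = set(config["stop_words"])
    return config
```

Because the file contains only data, it can be loaded by a serving process that has no access to the original training code.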

Accomplishments that we're proud of 🦚

It’s a Trusted and Convenient Tool for Fake News Validation
Minimalistic and Easy Interface
Completed our targeted minimum viable product within a very short span of time
Multi-platform solution with seamless API server integration

What we learned 🏫

Parsing data from Twitter was initially very confusing, as Twitter uses a lazy loader. This made it hard to fetch all the new nodes from the frontend. In addition, handling asynchronous JavaScript was a challenge for us, as we were beginners.

We also learned how to export a trained TensorFlow Keras model and deploy it on Google Cloud and Heroku, making use of Google Credentials.

Finally, we learnt to work together and distribute tasks so as to complete our project within a very short time span.
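With a lazy loader, the same post can appear in the page more than once as the user scrolls, so it helps to remember which posts have already been sent for analysis. The extension itself is written in JavaScript; the following is a language-neutral illustration in Python, and the class name `SeenPosts` and the use of string post ids are hypothetical.

```python
from typing import Iterable, List, Set


class SeenPosts:
    """Remember which post ids were already handled so each is analyzed once."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def new_only(self, post_ids: Iterable[str]) -> List[str]:
        """Return the ids not seen before, in order, and mark them as seen."""
        fresh: List[str] = []
        for post_id in post_ids:
            if post_id not in self._seen:
                self._seen.add(post_id)
                fresh.append(post_id)
        return fresh
```

Each time new nodes are loaded, only the ids returned by `new_only` would need to be queried against the backend.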

What's next for [Team 71-Onsen] Project Cactus ⌛

There are several improvements we want to make to our project, which are summarised below.

Extension support for other social media platforms for a wider reach (e.g. Reddit, Facebook and Instagram)
Support for other languages (e.g. Chinese, Tamil and Malay)
Improvements on the AI model
- Improvements on inference speed via weight pruning and model quantization
- Improvements on network architecture for even better predictions
Ability for Cactus to suggest trustworthy sources related to a given fake news article
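To give a feel for what model quantization means, here is a minimal sketch of linear 8-bit quantization applied to a plain list of weights. It is a toy illustration of the idea only, not a plan for how the Cactus model would be quantized; the function names `quantize` and `dequantize` are assumptions.

```python
from typing import List, Tuple


def quantize(weights: List[float]) -> Tuple[List[int], float, float]:
    """Map floats onto integers 0..255 using a linear scale and offset."""
    low, high = min(weights), max(weights)
    scale = (high - low) / 255 or 1.0  # avoid dividing by zero if all weights are equal
    quantized = [round((w - low) / scale) for w in weights]
    return quantized, scale, low


def dequantize(quantized: List[int], scale: float, low: float) -> List[float]:
    """Approximately recover the original floats from the 8-bit values."""
    return [q * scale + low for q in quantized]
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most half of `scale` per weight.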

Built With

firebase
google-cloud-ai-platform
heroku
keras
node.js
tensorflow
vue

Submitted to

MLDA Deep Learning Week 2021 Hackathon
- Winner Best Beginner Hack
EduHacks ($300,000+ in prizes)

Created by

I was tasked with deploying the TensorFlow model online through Google Cloud Platform. Although there were some hurdles, I managed to sort them out by deploying a Heroku backend API for routing purposes. This was my first attempt at model deployment, and it was surely a fruitful experience to see the Power of AI in Action!

Wong Zhao Wu
I worked on the web extension, which works on platforms such as Twitter. It watches for new mutations in the DOM as the user scrolls and sends the new content to the backend for analysis. Though it was challenging and hard to understand, it was fun nonetheless.

Chai Pin Zheng
I worked on the design and code of the Web App and on setting up the Heroku server for the Web App and Web Extension to interface with. I was also in charge of deploying the Web App.

Zheng Kai Ong
I worked on the machine learning model that powered our web app. While I had created deep learning models before, this was the first time I had to actually export and deploy one, so I learnt a lot from the experience.

Oh Tien Cheng