Inspiration
In the midst of battling a global health pandemic, we are also fighting an “infodemic”: the spread of incorrect information about the coronavirus. For fact-check queries specific to COVID-19, current tools are either rule-based, forcing the user to read through a large webpage, or they produce answers unrelated to the search query. In both cases, the results confuse the average user, who cannot easily interpret the returned information, which in turn fuels the spread of misinformation about the disease. With our submission, FixingBad, we take a step toward a COVID-19 fact-checking tool that combines machine learning and natural language processing with Google Cloud APIs to answer user queries reliably and concisely.
What it does
Our goal was to build a simple website that the average user can operate and that directly validates information about COVID-19 without producing confusing results. The web app consists of a front page that accepts a COVID-19-related query from the user. The query is then processed by a pipeline that either confirms or denies the stated fact. In borderline cases, when our results are inconclusive, we return a short list of trusted webpages where the user can find more information.
How it works
The front-end interfaces via Flask with an AI-based backend that produces a truth score for the fact-check query entered by the user. The first phase processes the user query with natural language processing tools to produce a standardized representation of the input tokens, from which we compute a semantic embedding using a deep learning model. The second phase curates articles relevant to the query from a list of trusted websites, e.g. the World Health Organization, the Centers for Disease Control and Prevention, and the International Vaccine Institute; we use the Google Cloud Custom Search API to retrieve these articles along with snippets from them. The snippets are passed through the same NLP pipeline to produce deep embeddings. Finally, the third phase compares the query and article embeddings to compute a similarity score, which determines the verdict on the fact.
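The third phase can be sketched roughly as follows. This is a minimal illustration, not our production code: the embeddings here would come from the BERT encoder, and the 0.4/0.6 cutoffs are made-up placeholders rather than our tuned thresholds.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def verdict(query_emb, snippet_embs, low=0.4, high=0.6):
    """Compare the query embedding against each trusted-snippet
    embedding and map the best similarity to a three-way verdict.
    The low/high thresholds are illustrative placeholders."""
    best = max(cosine_similarity(query_emb, s) for s in snippet_embs)
    if best >= high:
        return "true"
    if best <= low:
        return "false"
    return "inconclusive"  # borderline case: return trusted links instead
```

In the inconclusive band between the two cutoffs, the app falls back to returning the list of trusted sources described above.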
How we built it
For the first phase, we used open-source NLP libraries, spaCy and NLTK, to tokenize the input strings. Once we have the tokens, we normalize the sentiment of the phrase by inspecting negation tokens, then apply text pre-processing to reach a standardized representation of the string. We then used the popular deep learning model BERT to obtain an embedding of that representation. For the second phase, we tuned the Google Cloud Custom Search API to return articles and snippets in a suitable preference order; the NLP pipeline had to be augmented to handle non-standard text on websites. Finally, in the third phase, we used a cosine-similarity-based metric to compare query and article embeddings.
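The negation-normalization step can be approximated as below. This is a simplified sketch using plain string tokenization in place of spaCy/NLTK; the word list and the even/odd parity rule are illustrative assumptions, not the exact rules we shipped.

```python
# Illustrative negation cues; a real system would use spaCy's
# dependency labels rather than a hand-written word list.
NEGATION_TOKENS = {"not", "no", "never", "n't", "cannot"}

def normalize_negation(tokens):
    """Drop negation tokens and record the overall polarity:
    an even count of negations cancels out, an odd count flips it."""
    kept = [t for t in tokens if t.lower() not in NEGATION_TOKENS]
    negations = len(tokens) - len(kept)
    polarity = 1 if negations % 2 == 0 else -1
    return kept, polarity
```

For example, `normalize_negation("garlic does not cure covid".split())` keeps the content words and flags the flipped polarity, so the downstream comparison knows the claim is the negation of what the embedding represents.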
Challenges we ran into
For the first phase, the major challenge was ensuring that the embedding robustly handles sentiment changes in the input. While this is an open problem in the NLP community, we approximated a solution by pruning the negation tokens in the query. For the second phase, we had to deal with inconsistent formatting across webpages, which led to non-robust API responses (a slight, meaningless change in the input query can yield very different results) and to noise in the final deep representations. For the final phase, determining a threshold on the similarity metric that distinguishes a true from a false response was challenging, given the many ways the user can pose similar queries.
Accomplishments we're proud of
Since most fact-checking websites are either rule-based or only look at the query at a high level, we found that understanding the low-level details of the query leads to a large improvement in result quality. To verify this, we compared against Google’s own Fact-Checking API and found that the results we provide to the end user are far less confusing and far more concise.
In a relatively short time, we were able to use multiple cloud APIs (Google Cloud Custom Search and Fact-Checking) and state-of-the-art natural language processing tools (BERT, spaCy), and to integrate all the components into an end-to-end web application using Flask.
What we learned
Sleep is for the weak! Also, how to successfully use unfamiliar software and interfaces such as Flask, BERT, and the Google Cloud Custom Search API.
What's next for FixingBad
In the future, we plan to make our web app more robust to small changes in the input query that cause large shifts in the sentiment of what the user is asking. We also plan to use a machine learning model that learns the similarity threshold for distinguishing a “true” from a “false” response.
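A first step toward learning that threshold could look like the sketch below. It is a pure-Python stand-in for a real model: it simply searches for the cutoff on similarity scores that best separates labeled examples, and the scores and labels are fabricated toy data, not results from our system.

```python
def learn_threshold(scores, labels):
    """Pick the similarity cutoff that maximizes accuracy on
    labeled (score, is_true) examples. A real system would train a
    classifier (e.g. logistic regression) over richer features."""
    best_t, best_acc = 0.5, -1.0
    for t in sorted(set(scores)):
        acc = sum((s >= t) == y for s, y in zip(scores, labels)) / len(scores)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

# Toy data: similarity scores for claims verified true vs. false.
scores = [0.9, 0.8, 0.75, 0.3, 0.2, 0.1]
labels = [True, True, True, False, False, False]
```

With labeled fact-check pairs collected over time, the hand-tuned cutoff from the current system could be replaced by a data-driven one.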
PS: The working website (hosted on a local server) is demoed in the video.