Inspiration

With the US Presidential Debate fresh in our minds, we realized the true magnitude of the misinformation pandemic. With so many false claims being thrown around, we felt it was most important to address the issue at its roots. We recently read an article about the state of modern journalism which argued that local news networks are the most effective source of news. However, these networks are often owned by large corporations, which can lead to skewed truths, intense bias, and the rapid spread of misinformation. We believed that thorough research performed by intelligent software could help solve such a major problem in today’s society.

What it does

Verifai is our answer to the rapidly increasing spread of misinformation in digital media. The user simply pastes a YouTube video link, and Verifai scrapes the entire transcript of the video. The transcript is then annotated with key details about each major “chunk” of the video. Verifai indicates whether each chunk is fact, fake, or lacks enough information to reach a conclusion. The tool also gives a brief description of the chunk to provide context or to critique the validity of what the speaker said. Finally, Verifai lists up to 3 sources in MLA format to back up its assessment of the claim.
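
As a rough illustration of the output described above, each annotated chunk could be modeled as a small record holding a verdict, a description, and at most three sources. This is only a sketch: the names `Verdict`, `AnnotatedChunk`, and `MAX_SOURCES` are hypothetical and are not taken from the Verifai codebase.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Verdict(Enum):
    """The three outcomes described in the write-up (names are hypothetical)."""
    FACT = "fact"
    FAKE = "fake"
    NOT_ENOUGH_INFO = "not enough information"


# The write-up says "up to 3 sources"; the constant name is an assumption.
MAX_SOURCES = 3


@dataclass
class AnnotatedChunk:
    """One transcript chunk plus its fact-check annotation."""
    text: str
    verdict: Verdict
    description: str
    sources: List[str] = field(default_factory=list)  # MLA-formatted strings

    def __post_init__(self) -> None:
        # Enforce the "up to 3 sources" limit at construction time.
        if len(self.sources) > MAX_SOURCES:
            raise ValueError(f"at most {MAX_SOURCES} sources are allowed")


example = AnnotatedChunk(
    text="An example claim taken from a transcript.",
    verdict=Verdict.NOT_ENOUGH_INFO,
    description="No supporting or contradicting source was found.",
)
```

Keeping the verdict as an enum rather than a free-form string would make it harder for a misspelled label to slip from the backend to the frontend.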

How we built it

We started off by deciding on our tech stack. For the frontend we chose React, and for the backend we chose Flask. The choice was simple: React is excellent for building front-end interfaces, and our backend needed to be in Python because we planned on using the Cohere API, whose Python SDK is well documented and has good community support. We used YouTube’s transcript API to fetch the transcript for a given YouTube video. We then fed the transcript as a text corpus to Cohere’s chat API endpoint to classify each piece of text as factual, non-factual, or not having enough information. We also used the web connector to generate citations.
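
The steps between receiving a link and calling the model can be sketched as follows. This is a minimal sketch under stated assumptions, not the actual Verifai backend: the function names, the `max_chars` chunk size, the prompt wording, and the segment shape (dicts with `"text"` and `"start"` keys) are all assumptions, and the transcript fetch and the Cohere call are deliberately left out.

```python
from typing import Dict, List, Optional
from urllib.parse import parse_qs, urlparse

# Hostnames accepted for the long URL form (an assumption, not exhaustive).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or None if not recognized."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        return parsed.path.strip("/") or None
    if host in YOUTUBE_HOSTS:
        # Long links carry the ID in the "v" query parameter.
        ids = parse_qs(parsed.query).get("v")
        return ids[0] if ids else None
    return None


def chunk_segments(segments: List[Dict], max_chars: int = 500) -> List[Dict]:
    """Group consecutive transcript segments into chunks of about max_chars."""
    chunks: List[Dict] = []
    texts: List[str] = []
    start = None
    length = 0
    for seg in segments:
        text = seg["text"].strip()
        if not text:
            continue
        # Flush the current chunk if adding this segment would exceed the budget.
        if texts and length + len(text) + 1 > max_chars:
            chunks.append({"start": start, "text": " ".join(texts)})
            texts, start, length = [], None, 0
        if start is None:
            start = seg["start"]
        texts.append(text)
        length += len(text) + 1
    if texts:
        chunks.append({"start": start, "text": " ".join(texts)})
    return chunks


def build_prompt(chunk_text: str) -> str:
    """Build a classification prompt for one chunk (wording is hypothetical)."""
    return (
        "Classify the following statement as factual, non-factual, "
        "or not having enough information, and briefly explain why.\n\n"
        f"Statement: {chunk_text}"
    )
```

In a Flask route, each prompt returned by `build_prompt` would then be sent to the chat endpoint, one request per chunk, and the responses collected for the frontend.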

Challenges we ran into

Our initial tech stack was based solely on AWS infrastructure, using SageMaker, CodeWhisperer, Amazon Transcribe, and Cohere’s API. However, we did not get any AWS credits for the hackathon, so we had to make do with what we had. We simply used YouTube’s auto-generated transcripts and then classified the text. We also had the idea of using Next.js for our stack and exposing the model deployed on AWS as an API, but we dropped those plans and settled on React and Flask.

Accomplishments that we're proud of

A working project :) for starters. We are also proud of getting through the learning curve of extracting unstructured text data from a video and converting it into useful input for the Cohere model using some unconventional data preprocessing methods. Lastly, we feel this can be the starting point for something big in the field of real-time fact-checking of social media content.
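
The write-up does not spell out the preprocessing steps, so the following is only an assumed example of the kind of cleanup auto-generated captions typically need before being sent to a model: dropping bracketed sound cues such as `[Music]` and collapsing line breaks and repeated spaces. The function name `clean_caption_text` is hypothetical.

```python
import re

# Matches bracketed cues such as [Music] or [Applause] found in auto captions.
_BRACKETED_CUE = re.compile(r"\[[^\]]*\]")
_WHITESPACE = re.compile(r"\s+")


def clean_caption_text(raw: str) -> str:
    """Remove bracketed sound cues and normalize whitespace in caption text."""
    without_cues = _BRACKETED_CUE.sub(" ", raw)
    # Collapse newlines, tabs, and repeated spaces into single spaces.
    return _WHITESPACE.sub(" ", without_cues).strip()
```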

What we learned

Diving deep into the world of misinformation, we learned that the problem needs more attention as more and more users become active on social media. Overall, our group learned a lot of new technical skills across the entire tech stack, from frontend to backend. We also learned that because everyone had their own choice of tools, it was often difficult to merge our respective parts.

What's next for Verifai

We want to allow users to fact-check Instagram Reels, and we plan to build a browser extension for that. What we have now is a working proof of concept, but our plan is to move to the AWS infrastructure and create an industry-ready solution.
