Inspiration

Experiencing first-hand the outcomes of large-scale events such as presidential elections and COVID-19 has made me realize how much of the information people post on the internet is untrue, and how many people spread falsehoods on purpose. Especially during COVID, when everyone was stuck at home, many with only phones to entertain them, the influence of technology was evident; it is this accessibility and dependency on technology that makes all information, including false information, spread like wildfire across social media platforms and websites. Even when wandering around a supermarket, we can never tell how many of the labels on the sugary snacks or minty toothpastes are true. It is important for people to separate truth from falsehood in order to make wise decisions and avoid potential financial, emotional, or other consequences. Thus, I was inspired to create TruthSniffer to help people avoid those negative consequences and make informed decisions that they won't regret.

What it does

My app TruthSniffer estimates the disinformation likelihood of a given statement. The user first provides a statement (e.g. an internet article or a Tweet) in its original wording, as well as a rewrite paraphrasing the original statement. TruthSniffer uses existing Machine Learning/Natural Language Processing APIs to analyze the emotion and toxicity of the original statement, returning a score between 0 and 1 for each emotion and for toxicity. The app then uses the Custom Google Search API and the PaLM API to evaluate how reputable the statement is based on the rewrite. The output is one of five likelihood ratings: Very Likely, Likely, Undetermined, Unlikely, or Very Unlikely. If strong emotion or toxicity is detected in the text, the rating is bumped up by one level.
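The final rating step could be sketched roughly like this. This is a minimal illustration, not the app's actual code: the function names, the 0.8 "strong signal" threshold, and the emotion-score shape are my assumptions, and the real app would obtain the scores from the NLPCloud, Custom Google Search, and PaLM API calls rather than take them as parameters.

```javascript
// Ratings ordered from least to most likely disinformation.
const RATINGS = ["Very Unlikely", "Unlikely", "Undetermined", "Likely", "Very Likely"];

// baseRatingIndex: index into RATINGS from the reputability check (hypothetical).
// emotionScores:   object of 0-1 scores per emotion, as the NLP APIs return them.
// toxicityScore:   single 0-1 toxicity score.
function finalRating(baseRatingIndex, emotionScores, toxicityScore) {
  const STRONG = 0.8; // hypothetical cutoff for a "strong" signal
  const strongEmotion = Object.values(emotionScores).some((s) => s >= STRONG);
  const strongToxicity = toxicityScore >= STRONG;

  // Bump the likelihood up one level on strong emotion or toxicity,
  // capping at "Very Likely".
  let idx = baseRatingIndex;
  if (strongEmotion || strongToxicity) {
    idx = Math.min(idx + 1, RATINGS.length - 1);
  }
  return RATINGS[idx];
}

console.log(finalRating(2, { anger: 0.9, joy: 0.1 }, 0.2)); // prints "Likely"
```

The bump is one-directional by design: emotional or toxic language raises suspicion but its absence does not lower it.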

Challenges I ran into

I had to familiarize myself with JavaScript and Node.js, since I had previously been most comfortable with Java and only had a little experience with JS and Node. In addition, I wanted to display visual charts of the disinformation, emotion, and toxicity scores in my UI to make it more elegant and user-friendly, so I also had to learn Chart.js to implement this feature. Searching for the proper APIs also proved more difficult than I expected. I initially wanted to use the Komprehend APIs for analyzing emotion and toxicity, but they often did not return consistent results (e.g. timing out randomly, or occasionally returning errors for the same inputs), so I had to evaluate many other emotion/abuse-detection APIs before settling on the NLPCloud APIs. It was also difficult to test newly added sections of code, since I had to find article-and-rewrite test cases that would produce specific results and then actually run the new code to see whether it worked.
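For the charts, a Chart.js configuration along these lines is what I ended up learning to write. The score values and labels below are illustrative, not from the app; in the browser the config would be passed to `new Chart(ctx, config)` against a canvas element.

```javascript
// Example emotion/toxicity scores to visualize (illustrative values).
const scores = { anger: 0.72, fear: 0.35, joy: 0.05, toxicity: 0.61 };

// Chart.js bar-chart config: one bar per score, y-axis pinned to [0, 1]
// since every API score falls in that range.
const config = {
  type: "bar",
  data: {
    labels: Object.keys(scores),
    datasets: [
      {
        label: "Score (0-1)",
        data: Object.values(scores),
      },
    ],
  },
  options: {
    scales: { y: { min: 0, max: 1 } },
  },
};
```

Fixing the y-axis range keeps charts for different statements visually comparable instead of letting Chart.js rescale to the largest bar.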

What's next for TruthSniffer

Instead of requiring the user to copy and paste a statement found online, I could also support direct URL classification, so the user can simply paste the link to an article to see whether it is trustworthy. Also, my current app will sometimes produce different scores for two rewrites that are very similar (e.g. "generate" vs. "produce" can give different scores), so I could definitely improve this, possibly by using a custom model or by trying the newer version of the PaLM API. I also noticed that the analysis of emotion or toxicity is not always accurate; for example, a statement that mentions skin color is sometimes classified as "hate speech" even when it isn't. Further research and development of sentiment analysis models can hopefully fix this issue.
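The planned URL feature could start with something like the sketch below: fetch the page, strip markup to recover the article text, and feed that text into the existing pipeline. The function names and the regex-based tag stripping are my assumptions; a real implementation would likely use a proper article-extraction library rather than regexes.

```javascript
// Crude HTML-to-text conversion: drop scripts/styles, then all tags,
// then collapse whitespace. Good enough for a first prototype only.
function stripHtml(html) {
  return html
    .replace(/<script[\s\S]*?<\/script>/gi, "")
    .replace(/<style[\s\S]*?<\/style>/gi, "")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Hypothetical entry point for URL classification (Node 18+ global fetch):
// download the page and hand the extracted text to the existing analysis.
async function extractArticleText(url) {
  const res = await fetch(url);
  return stripHtml(await res.text());
}
```

Separating `stripHtml` from the network call keeps the text-extraction step testable without fetching real pages.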
