With HackTech cancelled, the first COVID-19 case reported in Santa Cruz, and worldwide hysteria over a possible global pandemic, it seemed unnatural for us to build for anything other than the coronavirus. While we regret not being able to come up with a vaccine this weekend, we built an app to reduce the out-of-control panic, which harms more people than it helps.
What it does
This Chrome extension gives users more context about the article they are reading, based on the media source it comes from. It pulls text from any media source and sends it to Google's NLP API, which returns keywords. We then pass those keywords to the News-API to recommend alternative articles, and we display a predicted reliability score.
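As a rough illustration of the keyword-to-recommendation step, the sketch below builds a search URL for the News API `/v2/everything` endpoint from a list of keywords. The function name `build_news_query`, the keyword cap, and the choice to join terms with AND are assumptions made for this example, not a description of our actual extension code. It only constructs the URL and does not make a network request.

```python
from urllib.parse import urlencode

NEWS_API_ENDPOINT = "https://newsapi.org/v2/everything"


def build_news_query(keywords, api_key, max_keywords=5, page_size=5):
    """Build a News API search URL from extracted keywords (hypothetical helper)."""
    # Keep only the first few keywords and quote multi-word phrases
    # so they are matched as exact phrases.
    terms = [f'"{k}"' if " " in k else k for k in keywords[:max_keywords]]
    params = {
        "q": " AND ".join(terms),   # require every keyword to appear
        "language": "en",
        "sortBy": "relevancy",
        "pageSize": page_size,
        "apiKey": api_key,          # placeholder; supply your own key
    }
    return f"{NEWS_API_ENDPOINT}?{urlencode(params)}"


url = build_news_query(["coronavirus", "Santa Cruz"], api_key="YOUR_KEY")
```

Joining with AND keeps the recommendations tightly on topic; switching to OR would return a broader set of alternative articles.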
How we built it
Our app uses machine learning and data science to fact-check various news sources using keywords, authors and sources from the URLs being checked. To train a model that accurately recognizes potentially unreliable sources, we used a dataset of 9.5 million URLs (https://github.com/several27/FakeNewsCorpus) that contains a variety of fake, biased and reliable news links.
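To make the idea of pulling source information out of a URL more concrete, here is a minimal sketch using only the Python standard library. The helper name `url_features` and the particular features it returns are assumptions for illustration; the features our model was actually trained on may differ.

```python
import re
from urllib.parse import urlsplit


def url_features(url):
    """Extract simple source features from an article URL (hypothetical helper)."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]  # treat www.example.com and example.com as the same source
    # Split the path on anything that is not a letter or digit,
    # then drop empty pieces and pure numbers such as dates.
    tokens = [
        t for t in re.split(r"[^a-z0-9]+", parts.path.lower())
        if t and not t.isdigit()
    ]
    return {
        "domain": host,
        "uses_https": parts.scheme == "https",
        "path_tokens": tokens,
    }


features = url_features("https://www.example.com/2020/03/covid-19-cure-found.html")
```

The domain can be matched against sources already labeled in the dataset, while the path tokens give a cheap set of keywords without fetching the page.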
Challenges we ran into
The dataset we used to train our model to recognize unreliable sources was far too large for the computing power available to us. Processing the data in a way that preserved as much thoroughness and accuracy as possible proved to be a lengthy and difficult task.
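One common way to cope with a file that does not fit in memory is to stream it row by row and keep only small aggregates. The sketch below counts labels per domain that way. It assumes the CSV has `domain` and `type` columns, and the function names are made up for this example; it is not the exact preprocessing we ran.

```python
import csv
from collections import Counter, defaultdict


def stream_rows(handle):
    """Yield one CSV row at a time as a dict, never loading the whole file."""
    # Article bodies can exceed the csv module's default field size limit.
    csv.field_size_limit(10**9)
    yield from csv.DictReader(handle)


def label_counts_by_domain(rows):
    """Count how often each label appears for each domain.

    Assumes each row has 'domain' and 'type' columns (an assumption
    about the dataset layout). Memory use grows with the number of
    distinct domains, not the number of rows.
    """
    counts = defaultdict(Counter)
    for row in rows:
        domain = (row.get("domain") or "").strip().lower()
        label = (row.get("type") or "").strip().lower()
        if domain and label:  # skip rows missing either field
            counts[domain][label] += 1
    return counts
```

In use, the file would be opened with `open(path, newline="", encoding="utf-8")` and passed to `stream_rows`, so only one row is held in memory at a time.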
What we learned
We learned how to use natural language processing for sentiment analysis and for classifying text data. We also spent a huge amount of time preprocessing data for use with a deep learning model.
What's next for Corona Truth Finder
We plan to make our model more accurate by checking more of the data in the sources we classify as reliable or unreliable. We also want to make our app available on more platforms.