The current political polarization in the United States is often attributed to the spread of fake news online. Given the recent Capitol insurrection, detecting it has never been more important. Manually checking every statistic and reference in an article is impractical; analyzing the language of the article to determine whether it is fake is somewhat more realistic.
What it does
The website takes in the text of an article and predicts whether it is fake.
How we built it
Development was fairly straightforward and centered on two things: creating the model, then applying it to user input from the website. We started by cleaning the 6335 entries in our dataset and extracting term frequency–inverse document frequency (TF-IDF) features from each. After vectorization, we fed the data into two models, both of which reached an accuracy of a little over 90%. The second half of the project was saving the better model with pickle and feeding input from the Django website into it.
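As a rough illustration of the pipeline described above, here is a minimal sketch using scikit-learn. It is not our exact code: the toy corpus, the "FAKE"/"REAL" label strings, the hyperparameters, and the helper names `train_candidates` and `predict_article` are all assumptions made for the example. The real model was trained on the full cleaned dataset.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression, PassiveAggressiveClassifier
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus standing in for the real dataset (assumption).
train_texts = [
    "Scientists publish peer reviewed study on vaccine safety",
    "City council approves budget for new public library",
    "Senate committee schedules hearing on infrastructure bill",
    "Shocking secret cure doctors do not want you to know",
    "Celebrity clone replaced politician, anonymous insider claims",
    "Miracle pill melts fat overnight, experts stunned",
]
train_labels = ["REAL", "REAL", "REAL", "FAKE", "FAKE", "FAKE"]

test_texts = [
    "Council schedules hearing on library budget",
    "Anonymous insider claims shocking miracle cure",
]
test_labels = ["REAL", "FAKE"]


def train_candidates(texts, labels):
    """Fit a TF-IDF + classifier pipeline for each candidate model."""
    candidates = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "passive_aggressive": PassiveAggressiveClassifier(
            max_iter=1000, random_state=0
        ),
    }
    fitted = {}
    for name, clf in candidates.items():
        # Keeping the vectorizer inside the pipeline means the pickled
        # model can later accept raw article text directly.
        pipe = make_pipeline(TfidfVectorizer(stop_words="english"), clf)
        pipe.fit(texts, labels)
        fitted[name] = pipe
    return fitted


def predict_article(model, text):
    """Return the predicted label for one article's text."""
    return model.predict([text])[0]


fitted = train_candidates(train_texts, train_labels)

# Score each model on held-out text and keep the more accurate one.
scores = {
    name: accuracy_score(test_labels, model.predict(test_texts))
    for name, model in fitted.items()
}
best_name = max(scores, key=scores.get)
best_model = fitted[best_name]

# Serialize with pickle; a web view could load these bytes from a file
# once at startup and call predict_article on submitted text.
blob = pickle.dumps(best_model)
restored = pickle.loads(blob)
```

Bundling the vectorizer and classifier into one pipeline is a design choice for the sketch: it avoids having to pickle and reload the fitted vectorizer separately on the website side.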
Challenges we ran into
Accomplishments that we're proud of
Our Logistic Regression and Passive Aggressive classifiers reached accuracies of 91.54% and 93.18%, respectively.
What we learned
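This was our first time hearing about the Passive Aggressive Classifier. Despite the name, it does not quantify how hostile a statement is: it is an online learning algorithm that leaves its weights unchanged (passive) when an example is classified correctly with enough margin and updates them (aggressive) when it is not, which is still really cool.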
What's next for Fake News
Improving the model and the web application built around it.