News or Not

Inspiration

Ads pop up whenever we browse the Internet, and we wanted to know whether a given piece of text carries valuable information for us. That inspired us to build a text classifier that determines whether a text is news or not.

What it does

Our algorithm trains a text classifier on 80% of the Brown corpus (500 texts, both news and non-news) and tests it on the remaining 20% to measure accuracy.
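To make the 80/20 split concrete, here is a minimal sketch of how the documents could be divided. The helper name `split_file_ids`, the shuffling step, and the fixed seed are our illustrative assumptions, not necessarily what the project code does.

```python
import random


def split_file_ids(file_ids, train_fraction=0.8, seed=0):
    """Shuffle document ids and split them into (train, test) lists.

    Hypothetical helper: the shuffle and the seed are assumptions made
    so that the split is random but reproducible.
    """
    ids = list(file_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_fraction)  # first `cut` ids go to training
    return ids[:cut], ids[cut:]
```

With NLTK installed and the corpus downloaded, calling `split_file_ids(brown.fileids())` after `from nltk.corpus import brown` would split the corpus's document ids into a training list and a test list.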

How we built it

We use the Naive Bayes classifier from the Natural Language Toolkit (NLTK) and train it on features we designed ourselves.
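As a rough sketch of what this could look like with NLTK: the "contains(word)" features and the `vocabulary` parameter below are assumptions for illustration only, since the features we actually designed are not listed here. `nltk.NaiveBayesClassifier.train` and `nltk.classify.accuracy` are the standard NLTK calls for training and scoring.

```python
def document_features(words, vocabulary):
    """Map a tokenized document to boolean 'contains(word)' features.

    Hypothetical feature set: one flag per vocabulary word, true when
    the word occurs in the document (case-insensitive).
    """
    present = {w.lower() for w in words}
    return {f"contains({w})": (w in present) for w in vocabulary}


def train_and_score(train_docs, test_docs, vocabulary):
    """Train NLTK's Naive Bayes classifier and return (classifier, accuracy).

    Each document is a (list_of_words, label) pair, with a label such as
    "news" or "not-news".
    """
    # Imported here so the feature extractor can be used without NLTK.
    import nltk

    train_set = [(document_features(w, vocabulary), label) for w, label in train_docs]
    test_set = [(document_features(w, vocabulary), label) for w, label in test_docs]
    classifier = nltk.NaiveBayesClassifier.train(train_set)
    return classifier, nltk.classify.accuracy(classifier, test_set)
```

The returned accuracy is the fraction of test documents whose predicted label matches the true label.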

Challenges we ran into

Initial training wasn't good enough. We found that a second round of training, based on the results of the first round, improved accuracy.

Accomplishments that we're proud of

Our classifier reaches 97% accuracy on the held-out 20% of the corpus!

What we learned

Teamwork rocks!

What's next for News or Not

It has tons of applications, like spam filtering, author classification...

Built With

natural-language-processing
python

Updates

Shantina Jwu-Hsuan Hwang started this project — Nov 11, 2018 10:55 AM EST
