FairFilter

Inspiration

Amazon has a problem with products that are unfairly boosted by hard-to-detect fake reviews. Dishonest sellers get around Amazon's fake-review detection systems by attaching genuine reviews written for other products to their own listings. As a result, consumers fall back on well-established brands, which reduces competition.

By making the review system more trustworthy, we aim to lower the barrier to entry into the e-commerce market for small businesses and entrepreneurs, increasing consumer choice and reducing prices through market competition.

What it does

These reviews are difficult to classify as real or fake because the text itself is genuine; it was simply written for a completely unrelated product. We identify potentially false reviews by building a machine learning model that assigns each review to a product category, which we then compare with the category of the product listing. For example, a "real" review for gloves that is being used to boost the rating of a pair of headphones would be classified as "clothing" rather than "electronics", the category of the listing.
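The comparison step described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function names, the keyword table, and the sample reviews are all hypothetical, and the keyword lookup is only a stand-in for a trained category classifier.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical keyword table: a toy stand-in for a trained classifier.
KEYWORDS = {
    "clothing": {"gloves", "jacket", "fabric", "wool"},
    "electronics": {"headphones", "battery", "bluetooth", "sound"},
}


def keyword_classifier(review: str) -> str:
    """Return the category whose keywords overlap most with the review text."""
    words = set(review.lower().replace(",", " ").replace(".", " ").split())
    # Ties (including zero matches) fall back to the first category in KEYWORDS.
    return max(KEYWORDS, key=lambda cat: len(words & KEYWORDS[cat]))


def flag_mismatched_reviews(
    reviews: Iterable[str],
    listing_category: str,
    classify: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Collect (review, predicted_category) pairs that disagree with the listing."""
    flagged = []
    for review in reviews:
        predicted = classify(review)
        if predicted != listing_category:
            flagged.append((review, predicted))
    return flagged


# Two made-up reviews attached to a hypothetical headphones listing.
reviews = [
    "These wool gloves kept my hands warm all winter.",
    "Great sound and the battery lasts all day.",
]
flagged = flag_mismatched_reviews(reviews, "electronics", keyword_classifier)
for text, category in flagged:
    print(f"Possible mismatch ({category}): {text}")
```

Because the classifier is passed in as a function, the same flagging logic would work with any model that maps review text to a category label.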

How We built it

Python, Machine Learning, Web scraping (Selenium / BeautifulSoup), Google Cloud Functions, Google AutoML Natural Language

Dual Model Design:

Custom Model - scikit-learn / TFIDF Vectorizer / Logistic Regression

Google AutoML Natural Language Model
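As a rough idea of what the custom model could look like, the sketch below chains a TF-IDF vectorizer and a logistic regression classifier in a scikit-learn pipeline. The training sentences, labels, helper names, and the choice to drop English stop words are assumptions made for illustration; the real model was presumably trained on a much larger scraped dataset with its own settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny made-up training set; a real model needs far more labeled reviews.
TRAIN_TEXTS = [
    "these gloves are warm and fit my hands perfectly",
    "the jacket fabric is soft and the size fits well",
    "comfortable shoes, nice fabric, true to size",
    "the headphones have great sound and long battery life",
    "bluetooth speaker pairs fast and the battery lasts",
    "charger cable works, sound quality of the earbuds is clear",
]
TRAIN_LABELS = [
    "clothing", "clothing", "clothing",
    "electronics", "electronics", "electronics",
]

# TF-IDF turns each review into a weighted bag-of-words vector,
# and logistic regression learns a category from those vectors.
model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    LogisticRegression(),
)
model.fit(TRAIN_TEXTS, TRAIN_LABELS)


def predict_category(review: str) -> str:
    """Predict the product category a review appears to describe."""
    return str(model.predict([review])[0])


def is_suspicious(review: str, listing_category: str) -> bool:
    """True when the predicted category differs from the listing's category."""
    return predict_category(review) != listing_category


review = "warm gloves, soft fabric, fit my hands"
print(predict_category(review))
print(is_suspicious(review, "electronics"))
```

A pipeline keeps vectorization and classification together, so the same object can be fitted once and then applied to raw review text.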

Challenges We ran into

Lack of experience with natural language processing / machine learning, Google Cloud, and MongoDB.

Training models takes a long time!

Accomplishments that We're proud of

Learned a lot!

What We learned

Machine Learning / NLP, Python, Google Cloud, MongoDB, Preprocessing Datasets with pandas

What's next for FairFilter

The ideal production implementation would be direct integration into retail platforms such as Amazon.

Submitted to

Hacklytics
- Winner Best Use of ML

Created by

I worked on data scraping using Selenium, Beautiful Soup, and Python.

Abrahan Nevarez
I worked on the machine learning model: training it and using it to analyze the reviews.

Samantha Daugherty
Terrell Ibanez
Human-Computer Interaction Researcher

Updates

Terrell Ibanez started this project — Feb 23, 2020 10:16 AM EST
