Amazon has a problem with products that are unfairly boosted by hard-to-detect false reviews. Dishonest sellers get around Amazon's false-review detection systems by reusing genuine reviews that were written for other products. As a result, consumers tend to fall back on well-established brands, which reduces competition.

By making the review system more trustworthy, we aim to lower the barrier to entry into the e-commerce market for small businesses and entrepreneurs, increasing consumer choice and reducing prices through market competition.

What it does

These reviews are difficult to classify as real or fake because the text itself is genuine; it was simply written for a completely unrelated product. We identify potentially false reviews by building a machine learning model that classifies each review into a product category, which we then compare to the category of the product listing. For example, a "real" review for gloves that is being used to boost the rating of a pair of headphones would be classified as "clothing" rather than "electronics", and that mismatch marks it as suspicious.
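The comparison step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `flag_mismatched_reviews` and `keyword_classifier` are hypothetical names, and the keyword lookup is only a stand-in for a trained category model.

```python
from typing import Callable, List


def flag_mismatched_reviews(
    reviews: List[str],
    listing_category: str,
    classify: Callable[[str], str],
) -> List[str]:
    """Return the reviews whose predicted category differs from the listing's."""
    return [text for text in reviews if classify(text) != listing_category]


def keyword_classifier(text: str) -> str:
    """Hypothetical stand-in for a trained model: a crude keyword lookup."""
    clothing_words = {"gloves", "jacket", "shirt", "fit"}
    words = set(text.lower().split())
    return "clothing" if words & clothing_words else "electronics"


# A headphones listing that carries one review clearly written about gloves.
reviews = [
    "These gloves keep my hands warm",
    "Great sound and deep bass",
]
flagged = flag_mismatched_reviews(reviews, "electronics", keyword_classifier)
```

Here `flagged` contains only the gloves review, since it is the one whose predicted category disagrees with the listing. Passing the classifier in as a function means either of the two models could be plugged in without changing the comparison logic.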

How we built it

Python, Machine Learning, Web scraping (Selenium / BeautifulSoup), Google Cloud Functions, Google AutoML Natural Language

Dual Model Design:

Custom Model - scikit-learn / TF-IDF Vectorizer / Logistic Regression

Google AutoML Natural Language Model
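A custom model of the kind listed above might look something like the sketch below. It assumes a scikit-learn pipeline that joins `TfidfVectorizer` and `LogisticRegression`; the tiny training set, the `category_model` name, and the `predict_category` helper are all made up for illustration and do not reflect the project's real data or hyperparameters.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data (hypothetical): review text paired with a product category.
train_texts = [
    "These gloves are warm and fit my hands well",
    "The jacket fabric is soft and the size fits",
    "Comfortable shirt, the cotton feels nice",
    "Headphones have deep bass and clear sound",
    "The battery lasts long and bluetooth pairs fast",
    "Charger cable works and the screen is bright",
]
train_labels = [
    "clothing", "clothing", "clothing",
    "electronics", "electronics", "electronics",
]

# TF-IDF turns each review into a weighted bag-of-words vector;
# logistic regression then learns one weight per word for each category.
category_model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    LogisticRegression(),
)
category_model.fit(train_texts, train_labels)


def predict_category(review_text: str) -> str:
    """Predict the product category that a single review most likely describes."""
    return category_model.predict([review_text])[0]
```

A call such as `predict_category("warm gloves that fit my hands")` returns the category label the model considers most likely, which can then be compared against the listing's category.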

Challenges we ran into

Lack of experience with natural language processing / machine learning and with Google Cloud / MongoDB. Training models takes a long time!

Accomplishments that we're proud of

Learned a lot!

What we learned

Machine Learning / NLP, Python, Google Cloud, MongoDB, Preprocessing Datasets with pandas

What's next for FairFilter

The ideal production implementation would be direct integration into retail platforms such as Amazon.
