Amazon has a problem with products that are unfairly boosted by difficult-to-detect false reviews. Dishonest companies get around Amazon's false-review detection systems by reusing real reviews written for other products. This leads consumers to favor well-established brands, which reduces competition.
By making the review system more trustworthy, we aim to lower the barrier to entry into the e-commerce market for small businesses and entrepreneurs, increasing consumer choice and reducing prices through market competition.
What it does
These reviews are difficult to classify as real or fake because they are genuine reviews, just written for completely unrelated products. We identify potentially false reviews by building a machine learning model that classifies each review into a product category, which we then compare with the category of the product listing. For example, a "real" review for gloves that is being used to boost the rating of a pair of headphones would be classified as "clothing" rather than "electronics", and that mismatch flags the review.
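The category comparison described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: `toy_classify`, `is_suspicious`, and the keyword lists are hypothetical stand-ins, and a trained model would take the place of the keyword lookup.

```python
from typing import Callable

# Hypothetical keyword lists standing in for a trained classifier.
KEYWORDS = {
    "clothing": {"gloves", "warm", "fit", "fabric", "size"},
    "electronics": {"headphones", "battery", "bluetooth", "sound", "charging"},
}


def toy_classify(review_text: str) -> str:
    """Return the category whose keywords overlap the review the most."""
    words = set(review_text.lower().split())
    return max(KEYWORDS, key=lambda cat: len(words & KEYWORDS[cat]))


def is_suspicious(
    review_text: str,
    listing_category: str,
    classify: Callable[[str], str] = toy_classify,
) -> bool:
    """Flag a review whose predicted category differs from the listing's."""
    return classify(review_text) != listing_category


review = "These gloves are warm and the fit is perfect"
print(is_suspicious(review, "electronics"))  # True: glove review on a headphone listing
print(is_suspicious(review, "clothing"))     # False: categories agree
```

Passing the classifier in as a callable keeps the mismatch check independent of which model produces the category.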
How we built it
Python, Machine Learning, Web scraping (Selenium / BeautifulSoup), Google Cloud Functions, Google AutoML Natural Language
Dual Model Design:
Custom Model - scikit-learn / TF-IDF Vectorizer / Logistic Regression
Google AutoML Natural Language Model
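As a rough sketch of what the custom model could look like, the snippet below chains scikit-learn's `TfidfVectorizer` and `LogisticRegression` in a pipeline. The training sentences, labels, and default hyperparameters are assumptions made up for illustration; the real model would be trained on a much larger scraped dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny made-up training set (assumption, for illustration only).
train_texts = [
    "warm gloves that fit my hands well",
    "soft cotton shirt, runs a size small",
    "the jacket fabric feels durable",
    "headphones have great sound and bass",
    "battery lasts all day on one charge",
    "bluetooth pairing with my phone was quick",
]
train_labels = [
    "clothing", "clothing", "clothing",
    "electronics", "electronics", "electronics",
]

# TF-IDF turns each review into a weighted bag-of-words vector;
# logistic regression then learns one weight per word for each category.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

predicted = model.predict(["these gloves fit well, very warm"])[0]
print(predicted)
```

The predicted category can then be compared with the listing's category to decide whether the review looks out of place.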
Challenges we ran into
Lack of experience with natural language processing / machine learning, and Google Cloud / MongoDB
Training models takes a long time!
Accomplishments that we're proud of
Learned a lot!
What we learned
Machine Learning / NLP, Python, Google Cloud, MongoDB, Preprocessing Datasets with pandas
What's next for FairFilter
The ideal production implementation would be direct integration into retail platforms such as Amazon.