Inspiration
We were inspired by the increasing number of manipulated product reviews on online shopping platforms like Amazon and Flipkart. These fake reviews mislead customers, promote poor-quality products, and unfairly harm honest competitors. We wanted to build an AI-powered system to detect and flag these reviews automatically, protecting both customers and sellers from misinformation and manipulation.
What it does
Our Fake Product Review Detection System:
Analyzes product reviews (text input) and classifies them as genuine or fake
Uses Natural Language Processing (NLP) to understand the structure, tone, and sentiment of reviews
Flags suspicious reviews based on patterns like repeated content, overly generic wording, and unnatural behavior
Provides a probability/confidence score on the likelihood that a review is fake
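To make the classify-and-score idea above concrete, here is a minimal sketch of the system's interface. The scoring logic is a toy heuristic standing in for our trained model (the phrase list, thresholds, and weights are all illustrative, not our actual parameters), but it shows the genuine/fake label plus confidence score the system returns:

```python
import re

# Illustrative generic phrases often seen in fake reviews (not our real feature list).
GENERIC_PHRASES = {"best product ever", "highly recommend", "five stars", "amazing quality"}

def fake_review_score(text: str) -> float:
    """Return a rough probability-like score in [0, 1] that a review is fake.

    A toy heuristic standing in for the trained classifier: it rewards
    generic wording, heavy repetition, and exclamation-mark spam.
    """
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    if not words:
        return 0.5  # no signal either way

    score = 0.0
    # Generic marketing phrases.
    score += 0.25 * sum(phrase in lowered for phrase in GENERIC_PHRASES)
    # Low lexical diversity suggests copy-paste repetition.
    if len(set(words)) / len(words) < 0.5:
        score += 0.3
    # Excessive exclamation marks.
    if text.count("!") >= 3:
        score += 0.2
    return min(score, 1.0)

def classify(text: str, threshold: float = 0.5) -> tuple:
    """Return (label, confidence) for a single review."""
    score = fake_review_score(text)
    return ("fake" if score >= threshold else "genuine", score)
```

In the real system the score comes from the trained model's predicted probability rather than hand-written rules.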
How we built it
Data Collection
Used open-source datasets like Amazon Product Reviews, Yelp Spam Dataset, and Kaggle fake review datasets
Preprocessed the data by removing noise, stop words, and standardizing the text
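The preprocessing step above can be sketched as follows. The stop-word list here is a tiny illustrative sample (the project used a full list, e.g. NLTK's), and the regexes show the kind of "noise" removal we mean (URLs, HTML tags, punctuation):

```python
import re

# Tiny illustrative stop-word list; a real pipeline would use a full one.
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "and", "i", "was", "to"}

def preprocess(review: str) -> list:
    """Lowercase, strip URLs/HTML/punctuation ('noise'), and drop stop words."""
    text = review.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"<[^>]+>", " ", text)        # remove HTML tags
    tokens = re.findall(r"[a-z']+", text)       # keep word tokens only
    return [t for t in tokens if t not in STOP_WORDS]
```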
Feature Engineering
Extracted textual features (TF-IDF, n-grams)
Added behavioral features: review length, frequency of posts by same user, excessive praise, etc.
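A sketch of how the textual and behavioral features above can be combined into one feature dictionary. The function and field names are illustrative (our actual pipeline used scikit-learn's TF-IDF vectorizer rather than raw counts), but the shape of the output is the same idea:

```python
from collections import Counter

def ngrams(tokens: list, n: int) -> list:
    """Return the list of n-grams (as tuples) from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def extract_features(tokens: list, user_review_count: int) -> dict:
    """Combine textual (unigram/bigram counts) and behavioral features.

    `user_review_count` (posts by the same user) is one of the behavioral
    signals mentioned above; the praise-word set is illustrative.
    """
    unigrams = Counter(ngrams(tokens, 1))
    praise_words = {"amazing", "perfect", "best", "awesome", "incredible"}
    return {
        "unigrams": unigrams,
        "bigrams": Counter(ngrams(tokens, 2)),
        "review_length": len(tokens),
        "user_review_count": user_review_count,
        # Share of tokens that are over-the-top praise ("excessive praise").
        "praise_ratio": sum(unigrams[(w,)] for w in praise_words) / max(len(tokens), 1),
    }
```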
Model Development
Tried multiple ML models:
Logistic Regression
Random Forest
Support Vector Machines
Deep Learning with LSTM
Transformer model (BERT) for best language understanding
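As a flavor of the simplest baseline in the list above, here is a from-scratch logistic regression trained with plain gradient descent on two toy features (praise ratio, exclamation count). This is a hand-rolled stand-in for the scikit-learn and transformer models we actually compared, not our training code:

```python
import math

def sigmoid(z: float) -> float:
    z = max(min(z, 35.0), -35.0)  # clamp to avoid math.exp overflow
    return 1 / (1 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=200):
    """Minimal logistic regression via stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of log-loss w.r.t. the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, xi) -> float:
    """Predicted probability that the review is fake."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)

# Toy training set: [praise_ratio, exclamation_count], label 1 = fake.
X = [[0.9, 3], [0.8, 4], [0.1, 0], [0.2, 1]]
y = [1, 1, 0, 0]
w, b = train_logistic(X, y)
```

The stronger models (SVM, LSTM, BERT) follow the same fit/predict pattern but learn far richer representations of the text.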
Challenges we ran into
Finding good labeled data: Real vs. fake reviews aren’t always obvious or available
Ambiguity in review text: Some fake reviews sound very realistic; some genuine ones sound robotic
Imbalanced dataset: Far more real reviews than fake ones — made training harder
Overfitting: Some models performed well on training but poorly on unseen reviews
Interpreting model output: Explaining why a review is fake wasn’t always straightforward
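One standard fix for the class-imbalance challenge above is re-weighting the loss so the rare "fake" class counts as much as the abundant "genuine" class. This small helper computes the same "balanced" weights as scikit-learn's `class_weight="balanced"` formula:

```python
from collections import Counter

def class_weights(labels: list) -> dict:
    """'Balanced' class weights: weight_c = n_samples / (n_classes * count_c).

    Rare classes get proportionally larger weights, so a minority 'fake'
    class contributes as much to the training loss as the majority class.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}
```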
Accomplishments that we're proud of
Achieved ~90% accuracy with our best-performing model (BERT)
Developed a working prototype that can flag suspicious reviews in real time
Built a balanced model using both linguistic and behavioral features
Successfully identified key review traits that signal likely fakes (repetition, exaggerated sentiment, etc.)
What we learned
How NLP can be applied to real-world problems like review analysis
Importance of clean, balanced data in ML
How to use transformer models like BERT for text classification tasks
How subtle patterns in language and user behavior can signal dishonesty
Trade-offs between model performance and interpretability
What's next for Fake Product Review Detection System using AI/ML
Real-time monitoring: Integrate the model with live review systems on e-commerce platforms
Multilingual support: Expand to detect fake reviews in other languages (e.g., Hindi, Spanish)
Explainability: Add tools to explain why a review is marked fake (e.g., via LIME or SHAP)
Reviewer reputation analysis: Study user history to identify fake-review bots or paid reviewers
Collaborate with platforms to use this system for automatic moderation and flagging
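On the explainability item above: the full LIME/SHAP integration is future work, but the core idea can be sketched with a simple leave-one-word-out attribution, a much-simplified cousin of LIME's perturbation approach. The toy scorer below stands in for the trained model:

```python
def word_attributions(text: str, score_fn) -> dict:
    """Drop each word in turn and record how much the fake-score falls.

    A positive attribution means the word pushed the review toward 'fake';
    LIME does this more robustly with many random perturbations.
    """
    words = text.split()
    base = score_fn(text)
    attributions = {}
    for i, w in enumerate(words):
        reduced = " ".join(words[:i] + words[i + 1:])
        attributions[w] = base - score_fn(reduced)
    return attributions

# Toy scorer standing in for the trained model (illustrative only).
def toy_score(text: str) -> float:
    flagged = {"amazing", "best", "perfect"}
    words = text.lower().split()
    return sum(w in flagged for w in words) / max(len(words), 1)
```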
Built With
- amazon-product-reviews
- bert
- deep-learning
- kaggle
- logistic-regression
- lstm
- machine-learning
- n-grams
- random-forest
- support-vector-machines
- tf-idf
- transformer
- yelp-spam-dataset