ElixirReviewer

Review Quality Detection System – Devpost Submission

Project Overview

Our Review Quality Detection System is an AI-powered tool that assesses the quality and relevance of location-based reviews. Unlike traditional approaches that conflate user satisfaction ratings with review quality, it assesses the content of a review independently of the star rating attached to it.

Problem Statement & Solution

The Problem: Location-based review platforms face a basic challenge: distinguishing review quality from user satisfaction. A 5-star review can contain spam, advertisements, or irrelevant content, while a 1-star review can be well-written, informative, and constructive. Traditional systems often use rating data as a proxy for quality, which conflates the two.

Our Solution: We built a machine-learning system that assesses review quality based purely on text characteristics and policy compliance, completely independent of user ratings. The system achieves 98.6% accuracy using an ensemble approach, which supports the view that review quality and restaurant rating are separate concepts.

Key Features & Functionality

Text Quality Analysis: length, readability, vocabulary diversity, grammar assessment
Policy Compliance: detection of advertisements, spam, irrelevant content, excessive rants
Content Relevance: focus on restaurant experience and dining-related topics
Writing Sophistication: grammar, formatting, and style analysis
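
As a rough illustration of the text quality signals listed above, the following sketch computes length, sentence count, and vocabulary diversity for a single review. The function name, the feature names, and the use of average sentence length as a readability proxy are assumptions made for this example, not the project's actual feature set.

```python
import re

def text_quality_features(review: str) -> dict:
    """Compute simple, rating-independent text statistics for one review."""
    words = re.findall(r"[A-Za-z']+", review.lower())
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    word_count = len(words)
    return {
        "char_length": len(review),
        "word_count": word_count,
        "sentence_count": len(sentences),
        # Type-token ratio: unique words divided by total words.
        "vocab_diversity": len(set(words)) / word_count if word_count else 0.0,
        # Average sentence length, used here as a crude readability proxy.
        "avg_sentence_length": word_count / len(sentences) if sentences else 0.0,
    }

features = text_quality_features("Great pasta. Great service. The pasta was fresh!")
print(features)
```

None of these features look at the star rating, which is the point of the rating-independent design.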

Advanced Policy Enforcement

Advertisement Detection: phrases like "buy now", "special offer", contact information, competitor promotion
Spam Detection: phone numbers, emails, URLs, suspicious patterns
Irrelevant Content: politics, sports, weather, entertainment topics
Rant Detection: excessive complaints, repetitive negative language
Quality Standards: minimum length, formatting requirements, vocabulary standards
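
The checks above could be approximated with a small set of pattern rules. This is a minimal sketch: the patterns, the rule names, and the MIN_WORDS threshold are hypothetical and far less thorough than a production rule set would need to be.

```python
import re

# Hypothetical rule set; phrases and patterns are illustrative only.
POLICY_PATTERNS = {
    "advertisement": re.compile(r"\b(buy now|special offer)\b", re.IGNORECASE),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "url": re.compile(r"(https?://|www\.)\S+", re.IGNORECASE),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}
MIN_WORDS = 5  # assumed minimum-length threshold

def detect_violations(review: str) -> list[str]:
    """Return the names of every rule the review triggers."""
    found = [name for name, pattern in POLICY_PATTERNS.items() if pattern.search(review)]
    if len(review.split()) < MIN_WORDS:
        found.append("too_short")
    return found

print(detect_violations("Special offer! Buy now at www.example.com or call 555-123-4567"))
print(detect_violations("The ramen broth was rich and the staff were friendly."))
```

Irrelevant-content and rant detection would likely need topic or sentiment signals rather than fixed patterns, so they are left out of this sketch.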

Real-World Performance

1,100 authentic restaurant reviews from Google Maps tested
833 reviews approved (75.7%) — no policy violations
232 reviews approved with warning (21.1%) — minor violations
34 reviews under review (3.1%) — medium-severity violations
1 review rejected (0.1%) — critical violations
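
The four outcomes above suggest that each review is routed by its most severe violation. The sketch below shows one way such a mapping could look, and recomputes the reported percentages from the counts. The severity names, their ordering, and the function name are assumptions; only the counts come from the results above.

```python
# Assumed severity ordering, from least to most severe.
SEVERITY_RANK = {"minor": 1, "medium": 2, "critical": 3}
DECISION_BY_RANK = {
    0: "approved",
    1: "approved_with_warning",
    2: "under_review",
    3: "rejected",
}

def moderation_decision(severities: list[str]) -> str:
    """Route a review according to its most severe violation, if any."""
    worst = max((SEVERITY_RANK[s] for s in severities), default=0)
    return DECISION_BY_RANK[worst]

# Counts reported above; the shares are recomputed rather than hard-coded.
counts = {"approved": 833, "approved_with_warning": 232, "under_review": 34, "rejected": 1}
total = sum(counts.values())
shares = {decision: round(100 * n / total, 1) for decision, n in counts.items()}
print(total, shares)
```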

Development Tools Used

VS Code – primary Python development environment
Jupyter Notebooks – data exploration and experimentation
Git – version control and collaboration
Terminal/CLI – script execution and environment management

APIs Used

Google Maps API – dataset collection and location-based review data
NLTK API – natural language processing and text analysis
TextBlob API – sentiment analysis and text processing
Scikit-learn API – machine-learning algorithms and training

Libraries & Frameworks

Hugging Face Transformers (for future enhancements)
PyTorch (for deep-learning experimentation)
Scikit-learn (Random Forest)
Pandas, NumPy
NLTK, TextBlob
XGBoost
Matplotlib, Seaborn, Plotly
Imbalanced-learn

Assets & Datasets

Google Local Reviews Dataset: 1,100 authentic restaurant reviews
Manually Labeled Data: quality assessment labels for training/validation
Image Dataset: 1,103 review images (taste, menu, atmosphere)
Policy Violation Annotations: curated examples
Quality Assessment Ground Truth: expert-validated scores

Technical Architecture

Data Preprocessing Pipeline

Text cleaning and normalization
NLP processing (stopword removal, lemmatization)
Feature extraction (TF-IDF, count vectors, topic modeling)
Policy-violation detection
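
The first two pipeline steps can be sketched with the standard library alone. The stopword list below is a tiny illustrative sample, and lemmatization is omitted because it depends on NLTK corpora that must be downloaded separately. Function names are assumptions for this example.

```python
import re

# Tiny illustrative stopword list, not the list used in the project.
STOPWORDS = {"a", "an", "and", "the", "is", "was", "were", "it", "to", "of", "in"}

def clean_text(text: str) -> str:
    """Lowercase, strip HTML tags and punctuation, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)       # drop HTML tags
    text = re.sub(r"[^a-z0-9\s']", " ", text)  # drop punctuation and symbols
    return re.sub(r"\s+", " ", text).strip()

def preprocess(text: str) -> list[str]:
    """Clean the text, split on whitespace, and remove stopwords."""
    return [tok for tok in clean_text(text).split() if tok not in STOPWORDS]

tokens = preprocess("The <b>Pad Thai</b> was AMAZING, and the service was quick!!")
print(tokens)
```

Note that policy checks for URLs, emails, and phone numbers would have to run on the raw text, since this cleaning step removes the characters those patterns rely on.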

Machine-Learning Models

Random Forest: baseline, interpretable
XGBoost: gradient boosting, strong performance
Ensemble Model: voting classifier combining the above
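
The write-up does not say whether the voting classifier uses hard or soft voting. Assuming soft voting, the combination step reduces to averaging each model's class probabilities and picking the most probable class, as in this sketch. The probabilities shown are made-up values for a single review, and soft_vote is a hypothetical helper rather than a library function.

```python
from typing import Optional, Sequence

def soft_vote(model_probs: Sequence[Sequence[float]],
              weights: Optional[Sequence[float]] = None) -> tuple[int, list[float]]:
    """Average per-class probabilities across models and pick the top class."""
    if weights is None:
        weights = [1.0] * len(model_probs)
    total = sum(weights)
    n_classes = len(model_probs[0])
    averaged = [
        sum(w * probs[c] for w, probs in zip(weights, model_probs)) / total
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=averaged.__getitem__), averaged

# Hypothetical outputs for one review: [P(low quality), P(high quality)]
random_forest_probs = [0.55, 0.45]
xgboost_probs = [0.20, 0.80]
label, averaged = soft_vote([random_forest_probs, xgboost_probs])
print(label, averaged)
```

Here the two models disagree, and the more confident one decides the outcome. With hard voting, each model would instead contribute a single class vote regardless of confidence.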

Feature Engineering

Textual features: TF-IDF, count vectors, topic modeling
Text quality: length, word count, readability scores
Policy violations: spam/ads/irrelevance signals
Writing sophistication: vocabulary diversity, grammar indicators
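
To make the TF-IDF features concrete, here is a small from-scratch computation using relative term frequency and a smoothed inverse document frequency. This is one common variant chosen for illustration; the weighting and normalization actually used in the project may differ, and the tfidf function name is an assumption.

```python
import math
from collections import Counter

def tfidf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: weight} mapping per tokenized document."""
    n_docs = len(docs)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            # Relative term frequency times smoothed IDF.
            term: (count / len(doc)) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors

docs = [["pasta", "fresh", "pasta"], ["service", "slow"], ["pasta", "slow"]]
vectors = tfidf(docs)
print(vectors[0])
```

A term that appears in fewer reviews, such as "fresh" here, receives a higher weight per occurrence than a term that appears in most of them.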

Model Performance Results

Model	Accuracy	Precision	Recall	F1 Score	ROC AUC
Ensemble	0.986	0.986	0.986	0.986	0.999
XGBoost	0.973	0.973	0.973	0.973	0.996
Random Forest	0.886	0.889	0.886	0.885	0.958
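
In every row of the table, recall equals accuracy. That is what support-weighted averaging across classes produces, since weighted recall is always identical to accuracy. The write-up does not state the averaging method, so this is an inference. The sketch below computes weighted metrics on made-up labels to show the identity.

```python
from collections import Counter

def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        class_precision = tp / predicted if predicted else 0.0
        class_recall = tp / count
        denom = class_precision + class_recall
        class_f1 = 2 * class_precision * class_recall / denom if denom else 0.0
        weight = count / n  # share of true examples belonging to this class
        precision += weight * class_precision
        recall += weight * class_recall
        f1 += weight * class_f1
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Made-up labels for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 1]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```

Because weighted recall carries no information beyond accuracy, per-class precision and recall would say more about how the models handle the rarer class.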

Real-World Applications

Content Moderation: automatically filter low-quality reviews
Platform Integrity: maintain quality standards
User Experience: ensure relevant, informative content
Business Insights: focus on genuine customer feedback

Project Relevance

This project addresses the challenge of assessing review quality and relevancy in location-based platforms. By separating quality assessment from rating bias, it delivers a more accurate and fair moderation and quality-control system. Relevant for:

Review platforms (Google Maps, Yelp, TripAdvisor)
E-commerce sites with location-based reviews
Restaurant management systems
Content-moderation tools
Quality-assurance systems

Future Enhancements

Multi-language support for global deployment
Real-time processing capabilities
Advanced NLP models (BERT, GPT integration)
User-feedback integration for continuous improvement
API deployment for third-party integration

Project Status: ✅ Production-Ready
Best Model Performance: Ensemble (F1: 0.986, Accuracy: 0.986)
Key Achievement: Proper separation of rating and quality assessment
Team: ElixirHackers

Updates

Gabriel Tang started this project — Aug 30, 2025 11:23 AM EDT
