Inspiration

Online reviews influence almost every decision we make — from choosing a restaurant to booking a hotel. Yet many reviews are spammy, misleading, or irrelevant, which hurts both users and businesses. We were inspired to build a system that restores trust in location reviews by filtering out the noise and surfacing only genuine, policy-compliant feedback.

What it does

ReviewShield is an AI-powered moderation system that:

  • Detects spam, advertisements, and irrelevant content in location reviews.
  • Assesses whether reviews are genuinely relevant to the place being reviewed.
  • Enforces platform policies automatically using a hybrid of rule-based checks, ML classifiers, and large language models (LLMs).
  • Outputs clean, trustworthy reviews that help users make better decisions, ensure fair representation for businesses, and reduce moderation workload for platforms.
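To illustrate the rule-based layer of the hybrid above, here is a minimal sketch of what a first-pass spam check could look like. The patterns and rule names are hypothetical placeholders, not ReviewShield's actual rules:

```python
import re

# Hypothetical patterns for a rule-based first pass; a real deployment
# would tune these and combine them with the ML and LLM layers.
SPAM_PATTERNS = {
    "link": r"https?://\S+",                  # URLs are a common spam signal
    "promo_word": r"\b(promo|discount|coupon)\b",
    "char_run": r"(.)\1{4,}",                 # e.g. "soooooo" style repetition
}

def rule_based_flags(review: str) -> list[str]:
    """Return the names of rules a review trips (empty list = passes)."""
    text = review.lower()
    return [name for name, pattern in SPAM_PATTERNS.items()
            if re.search(pattern, text)]

print(rule_based_flags("Great tacos, friendly staff."))         # []
print(rule_based_flags("50% DISCOUNT at http://spam.example"))  # ['link', 'promo_word']
```

Reviews that trip no rules would then flow to the ML classifiers, reserving the slower LLM check for ambiguous cases.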

How we built it

We followed a staged workflow:

  1. Data Cleaning & Preprocessing – removed duplicates, standardized formats, cleaned text, and handled missing values.
  2. Exploratory Data Analysis (EDA) – studied distributions of ratings, review lengths, sentiment, and spam patterns.
  3. Classical ML Models – trained a logistic regression classifier on TF-IDF features as a baseline.
  4. Deep Learning – fine-tuned a multi-task BERT model for quality and relevance classification.
  5. Policy Enforcement – integrated the OpenAI API to detect nuanced violations that classical models struggled with.
  6. Evaluation – measured accuracy, precision, recall, and F1-score to validate performance.
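The evaluation step (precision, recall, F1) reduces to counting prediction outcomes. A toy sketch with invented labels, not our dataset, assuming 1 = spam and 0 = genuine:

```python
# Toy evaluation sketch for the "spam" class; labels are invented
# for illustration, not project data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

precision = tp / (tp + fp)            # of flagged reviews, how many were spam
recall = tp / (tp + fn)               # of spam reviews, how many were caught
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In moderation, precision errors hide legitimate reviews while recall errors let spam through, which is why we tracked both rather than accuracy alone.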

Challenges we ran into

  • Subjectivity in labels: Some reviews are borderline cases (e.g., constructive complaints vs. rants).
  • Class imbalance: Genuine reviews far outnumber spammy or irrelevant ones.
  • Tradeoff between accuracy and interpretability: Logistic regression was explainable, but BERT achieved better results.
  • Latency with LLM calls: GPT-based policy enforcement added noticeable per-review latency and runtime complexity.
  • Defining vague policies: Translating broad rules (e.g., “no rants”) into enforceable logic required combining ML + rule-based reasoning.
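One standard mitigation for the class-imbalance challenge above is inverse-frequency class weighting in the training loss. A minimal sketch with made-up class counts, using the same heuristic as scikit-learn's `class_weight="balanced"` option:

```python
from collections import Counter

# Hypothetical label distribution: genuine reviews dominate.
labels = ["genuine"] * 900 + ["spam"] * 80 + ["irrelevant"] * 20
counts = Counter(labels)
n, k = len(labels), len(counts)

# "balanced" heuristic: weight = n / (k * count), so rare classes
# contribute proportionally more to the training loss.
weights = {cls: n / (k * c) for cls, c in counts.items()}
print(weights)  # rare classes get weights well above 1.0
```

Weighting avoids a degenerate classifier that scores high accuracy by labeling everything "genuine".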

Accomplishments that we're proud of

  • Built a complete ML/NLP pipeline in a short timeframe.
  • Successfully combined classical ML, transformers, and LLMs into a single moderation system.
  • Used SHAP explainability to interpret predictions and improve transparency.
  • Achieved solid metrics (high precision in spam detection while maintaining recall).
  • Designed a framework that can scale to real-world platforms with minimal human intervention.

What we learned

During this project, we gained hands-on experience building end-to-end machine learning pipelines for text moderation. We saw how a multi-task transformer like BERT can significantly outperform simpler baselines, and how important interpretability methods like SHAP are for keeping those gains trustworthy. Integrating LLMs for policy enforcement showed us that subtle violations, such as off-topic rants or disguised advertisements, are better detected with context-aware reasoning. Beyond the technical side, we learned to balance fairness, accuracy, and efficiency when designing real-world moderation systems, and to collaborate effectively as a team by dividing responsibilities across preprocessing, modeling, explainability, and LLM integration.

What's next for ReviewShield

Looking ahead, we plan to extend ReviewShield into a production-ready API that platforms can integrate directly into their review pipelines. We want to incorporate active learning so the system can improve continuously from moderator feedback, as well as expand into multilingual support to handle reviews from diverse global audiences. Another priority is optimizing real-time inference, particularly by reducing the latency introduced by LLM-based policy enforcement. Finally, we envision building dashboards and visualization tools that provide businesses and platforms with clear insights into flagged reviews, helping them monitor and maintain trust at scale.
