Track 1: Filtering the Noise: ML for Trustworthy Location Reviews
Problem Statement
This project develops a comprehensive machine learning pipeline designed to assess the trustworthiness of location-based Google reviews.
The core problem addressed is the proliferation of unreliable online reviews. For location-based services and businesses, authentic customer feedback is invaluable. However, the presence of fake reviews, spam, and overly generic or uninformative content diminishes the value of review platforms.
This project aims to solve this by creating a robust system that can programmatically evaluate the trustworthiness of a review based on its content and context.
Development Tools
- Development Tool: Jupyter Notebooks + Visual Studio Code
- Version Control: Git and GitHub
APIs, Libraries, and Frameworks
APIs:
- OpenAI GPT-4o-mini: Utilized for automated labeling of the Google review dataset through prompt engineering, providing a scalable and intelligent way to generate a labeled training set based on a predefined policy.
Libraries and Frameworks:
- Pandas, NumPy, Scikit-learn, XGBoost, TensorFlow, PyTorch, NLTK, Gensim, Imbalanced-learn
Pre-Trained Models Utilized:
- Gensim: LdaModel
- Sentence-Transformers: all-MiniLM-L6-v2
- HuggingFace: CardiffNLP twitter-roberta-sentiment
- Unitaryai: Detoxify
Assets and Datasets
- Google Local Review Dataset
- Hawaii & Mississippi for Model Training, South Dakota for Use Case Demo
- GPT-4o-mini Labeled Data
How the Solution Addresses the Problem
The solution is based on a seven-policy framework.
- Rule-Based Filters: Automatically flag reviews violating explicit criteria (Policies C, D1, F).
- Feature Extraction: For the remaining policies (A, B, D2, E, G), extract scores using pre-trained models (HuggingFace transformers, Detoxify).
- Classification Model: Combine features and input into a Multi-Layer Perceptron (MLP) to classify reviews as trustworthy (1) or untrustworthy (0).
This layered design blends deterministic rules with learned representations, ensuring both precision and adaptability in detecting low-quality or deceptive reviews.
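The layered flow described above can be sketched in Python. The thresholds, the advertisement regex, and the stubbed MLP scoring step are illustrative assumptions, not the project's exact values:

```python
import re

# Hypothetical cutoffs; the project's actual values may differ.
MIN_LENGTH = 20                # Policy C: minimum content length
MAX_NON_ALNUM_RATIO = 0.5      # Policy D1: symbol-heavy reviews
AD_PATTERN = re.compile(       # Policy F: links, emails, phone numbers
    r"(https?://\S+|www\.\S+|\S+@\S+\.\S+|\+?\d[\d\s().-]{7,}\d)"
)

def rule_based_flags(text: str) -> list[str]:
    """Return the explicit policies a review violates (C, D1, F)."""
    flags = []
    if len(text) < MIN_LENGTH:
        flags.append("C")
    non_alnum = sum(1 for ch in text if not (ch.isalnum() or ch.isspace()))
    if text and non_alnum / len(text) > MAX_NON_ALNUM_RATIO:
        flags.append("D1")
    if AD_PATTERN.search(text):
        flags.append("F")
    return flags

def classify(text: str, features: list[float]) -> int:
    """0 = untrustworthy, 1 = trustworthy."""
    if rule_based_flags(text):       # deterministic layer fires first
        return 0
    return mlp_predict(features)     # learned layer over A/B/D2/E/G scores

def mlp_predict(features: list[float]) -> int:
    # Placeholder for the trained Multi-Layer Perceptron.
    return 1 if sum(features) / len(features) > 0.5 else 0
```

Rule violations short-circuit to an untrustworthy label, so the MLP only sees reviews that pass the deterministic layer.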
Policy Framework
1. Content Relevance and Quality
Policy A - Business category/name & review content alignment:
Semantic correlation between review text and business info (name, category, description) is measured using LDA + cosine similarity. Low similarity suggests an off-topic, untrustworthy review.
Policy B - Overly generic sentiment with no substance:
Reviews like “great place” or “bad service” are detected using a specificity score:
S = w₁·L + w₂·N + w₃·T + w₄·R
where S is the final specificity score, L, N, T, R are normalized features, and w₁, w₂, w₃, w₄ are their weights.
Policy C - Minimum content length:
Reviews shorter than 20 characters are flagged as too short to be meaningful.
Policy D - Nonsensical content:
- D1 (Non-alphanumeric): High ratio of symbols/non-alphanumeric characters → flagged.
- D2 (No meaning): Detect incoherent/gibberish reviews using sentence-transformers.
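Policy B's weighted specificity score could be computed as below; the equal weights and the 0.3 genericness cutoff are assumptions for illustration, not the project's tuned values:

```python
def specificity_score(L: float, N: float, T: float, R: float,
                      weights: tuple = (0.25, 0.25, 0.25, 0.25)) -> float:
    """S = w1*L + w2*N + w3*T + w4*R over features normalized to [0, 1]."""
    w1, w2, w3, w4 = weights
    return w1 * L + w2 * N + w3 * T + w4 * R

def is_generic(features: tuple, threshold: float = 0.3) -> bool:
    """Flag a review as overly generic when specificity falls below the cutoff."""
    return specificity_score(*features) < threshold
```

A review like "great place" would score near zero on every normalized feature and fall under the cutoff.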
2. Review Cohesion and Intent
Policy E - Do ratings match the sentiment of the text:
Sentiment analysis (HuggingFace) compares text sentiment with the star rating. Large mismatches → flagged as untrustworthy.
Policy F - Advertisement-like review filter:
Regex rules to detect promotional content (links, emails, phone numbers).
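The Policy E mismatch check could look like the following sketch, assuming the sentiment model's output has been scaled to [-1, 1] and the tolerance value is an illustrative choice:

```python
def rating_sentiment_mismatch(stars: int, sentiment: float,
                              tolerance: float = 1.0) -> bool:
    """Flag when the star rating and the text sentiment disagree strongly.

    stars: 1-5 rating; sentiment: model score scaled to [-1, 1]
    (e.g., derived from a RoBERTa sentiment model's class probabilities).
    The tolerance of 1.0 is an assumed cutoff.
    """
    rating_scaled = (stars - 3) / 2.0   # map 1..5 onto [-1, 1]
    return abs(rating_scaled - sentiment) > tolerance
```

A 5-star rating paired with strongly negative text (or vice versa) exceeds the tolerance and gets flagged.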
3. Prohibited Content and Unsafe Material
Policy G - Excessive use of profanity/hate speech:
Toxicity detection (Detoxify) flags reviews containing offensive, hateful, or discriminatory language.
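Detoxify returns per-category probabilities (e.g., toxicity, insult, threat); a thresholding step like the following could turn those scores into a Policy G flag. The 0.7 cutoff is an assumed value:

```python
def violates_policy_g(scores: dict[str, float], threshold: float = 0.7) -> bool:
    """Flag a review when any toxicity category exceeds the cutoff.

    `scores` mirrors a Detoxify-style output dict,
    e.g. {'toxicity': 0.95, 'insult': 0.2, ...}.
    The 0.7 threshold is an illustrative assumption.
    """
    return any(p > threshold for p in scores.values())
```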
Built With
- huggingface
- natural-language-processing
- nltk
- openai
- python
- pytorch
- scikit-learn
- tensorflow