EverythingWorks

LLM ReviewGuard

Inspiration

In a world where users rely on Google location reviews to decide where to eat, shop, or get a haircut, the integrity of these reviews has never been more important. Yet, platforms are increasingly flooded with irrelevant, spammy, or even fabricated reviews, many of which undermine user trust and business reputations. Manual moderation can't keep up, and existing heuristics often fall short. Worse, they lack explainability — businesses don’t know why a review was flagged, and users don’t know what qualifies as a policy violation.

That’s why we built LLM ReviewGuard — in response to Problem Statement 1: Evaluating Quality & Relevancy of Google Location Reviews. Our solution harnesses the power of large language models to detect spam, ads, false reviews, and irrelevance, while also scoring the quality and contextual fit of each review.

What it does

LLM ReviewGuard is an automated moderation framework that uses GPT-4o to evaluate reviews of physical business locations. For each review, it performs multi-criteria assessment, identifying policy violations, assigning structured relevance and quality scores, and most importantly, providing a transparent justification for every decision.

Our system doesn’t just flag reviews — it explains them, allowing platforms and businesses to understand not just what was flagged, but why.

Solution Overview

Here’s how LLM ReviewGuard works:

Review + Metadata Contextualization: We begin by merging raw user reviews with location metadata (name, category, address, hours). This provides the LLM with critical context to interpret each review accurately.
Prompting & Moderation Logic: A carefully engineered system prompt guides the LLM to evaluate each review based on four policy categories (Advertisement, Irrelevant, False, Vulgar Language), assign Relevance and Quality scores (Low/Average/High), and return a concise Extraction Justification.
LLM Evaluation: Using the OpenAI GPT-4o model, we send each contextualized prompt and receive back structured JSON outputs for all moderation fields.
Output Structuring: Results are parsed, joined back to the input dataset, and saved to a final file containing both the original review and the moderation labels, ready for validation, visualization, or downstream applications.
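
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code from our repository: the function names `build_user_prompt` and `parse_moderation`, the JSON key spellings, and the choice to encode each policy violation as a boolean are all assumptions made for the example. The actual call to GPT-4o is left out, so the sketch only covers what happens before and after the model.

```python
import json

# Assumed JSON keys: the write-up names these fields, but the exact key
# spelling and the boolean encoding of violations are illustrative guesses.
POLICY_FIELDS = ("Advertisement", "Irrelevant", "False", "Vulgar Language")
SCORE_FIELDS = ("Relevance", "Quality")
SCORE_LEVELS = {"Low", "Average", "High"}
JUSTIFICATION_FIELD = "Extraction Justification"
ALL_FIELDS = (*POLICY_FIELDS, *SCORE_FIELDS, JUSTIFICATION_FIELD)


def build_user_prompt(review_text, location):
    """Combine one review with its location metadata into a prompt string."""
    return (
        f"Location name: {location['name']}\n"
        f"Category: {location['category']}\n"
        f"Address: {location['address']}\n"
        f"Hours: {location['hours']}\n"
        f"Review: {review_text}"
    )


def parse_moderation(raw_json):
    """Validate the model's JSON reply and return a flat dict of labels."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in ALL_FIELDS if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    for key in POLICY_FIELDS:
        if not isinstance(data[key], bool):
            raise ValueError(f"{key} must be true or false")
    for key in SCORE_FIELDS:
        if not isinstance(data[key], str) or data[key] not in SCORE_LEVELS:
            raise ValueError(f"{key} must be one of {sorted(SCORE_LEVELS)}")
    # Keep only the expected fields so the result joins cleanly to the dataset.
    return {key: data[key] for key in ALL_FIELDS}


if __name__ == "__main__":
    # Made-up sample data, used only to show the flow end to end.
    location = {
        "name": "Sample Cafe",
        "category": "Coffee shop",
        "address": "1 Example Street",
        "hours": "08:00-18:00",
    }
    print(build_user_prompt("Great flat white, friendly staff.", location))
    reply = json.dumps({
        "Advertisement": False, "Irrelevant": False, "False": False,
        "Vulgar Language": False, "Relevance": "High", "Quality": "Average",
        "Extraction Justification": "Describes the drinks and service.",
    })
    print(parse_moderation(reply))
```

Validating the reply before joining it back to the dataset means a malformed response fails loudly instead of silently producing an empty or mislabeled row.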

How we built it

To bring ReviewGuard to life, we combined powerful AI infrastructure with modular Python tooling:

Backend Processing: pandas, regex, and robust data cleaning logic for real-world review datasets.
LLM Integration: GPT-4o via the OpenAI API, wrapped in a reusable LLMClient.
Prompt Engineering: Designed prompts that balance strict policy enforcement with flexibility to handle ambiguous language.
Validation Pipeline: Ground-truth-based evaluation using scikit-learn metrics such as accuracy and macro-F1 to assess alignment with human moderation.
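
To make the two validation metrics concrete, here is a small dependency-free sketch of what they compute. Our pipeline uses scikit-learn for this; the pure-Python functions below (`accuracy` and `macro_f1`, names chosen for this example) are a stand-in meant to mirror the usual definitions, with macro-F1 averaged over every label that appears in either the human labels or the predictions. The sample labels are made up.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the human label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1, so rare labels count as much as common ones."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); a label never predicted correctly scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Made-up example: human Quality labels versus model predictions.
    human = ["High", "Low", "Average", "High"]
    model = ["High", "Low", "Low", "High"]
    print(f"accuracy: {accuracy(human, model):.3f}")
    print(f"macro-F1: {macro_f1(human, model):.3f}")
```

Macro-F1 is useful alongside accuracy because policy violations are typically rare: a model that never flags anything can still score high accuracy, while its macro-F1 drops sharply.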

Our system is fully modular, allowing fast prompt iteration, dataset swapping, and easy extension to other moderation domains.

What’s next for LLM ReviewGuard

LLM ReviewGuard is more than a prototype: it's a foundation for the future of content moderation.

We envision this approach being deployed across review platforms like Google Maps, Yelp, and TripAdvisor, where high-volume review streams demand both precision and explainability:

For businesses to monitor service quality and reputation
For platforms to ensure compliance, fairness, and content integrity
For LLM trainers to leverage well-labeled, policy-grounded data

Try it here! https://github.com/siyinggg0402/Tiktok-Hackathon