Inspiration

The project is about improving the credibility and reliability of location reviews for users and businesses. To guide our solution, we drew inspiration from our research along two fronts:

a) Human-Centered Data Cleaning
Similar to how IKEA prioritizes immersive and meaningful shopping experiences, we wanted to ensure reviews retained context, authenticity, and usefulness. Just as IKEA curates furniture to fit real-world lifestyles, our pipeline curates reviews to fit real-world decision-making.

b) Semantic Intelligence
Inspired by TikTok’s For You Page and how it dynamically adapts to evolving user preferences, our review pipeline goes beyond static keyword rules. By leveraging semantic similarity models, we ensure reviews remain relevant in context, even as language and trends shift.

What it does

Our system is designed in multiple stages that combine data preprocessing, rule-based filtering, machine learning, and an LLM wrapper to evaluate the quality and authenticity of reviews.

The first stage focuses on cleaning and normalizing the dataset. Using our preprocessing script, we remove blank cells, trim whitespace, fix encoding issues, and eliminate duplicates. We also standardize text by correcting capitalization, ensuring consistent sentence formatting, and fixing common errors such as a standalone lowercase “i,” which becomes “I.” Alongside this, metadata is validated so that ratings remain within a 1-5 scale and business categories fall into defined buckets such as taste, menu, and atmosphere. This process ensures the dataset is consistent and reliable for downstream analysis.
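A rough sketch of this cleaning stage is below; the column names `text` and `rating` are illustrative, not our actual schema, and the real script also applies ftfy and Unidecode for encoding fixes:

```python
import re

import pandas as pd


def preprocess_reviews(df: pd.DataFrame) -> pd.DataFrame:
    """Clean and normalize raw reviews (column names are hypothetical)."""
    out = df.copy()
    # Trim whitespace, then drop blank and duplicate review texts.
    out["text"] = out["text"].astype(str).str.strip()
    out = out[out["text"] != ""].drop_duplicates(subset="text")
    # Correct the common standalone lowercase "i" -> "I".
    out["text"] = out["text"].apply(lambda t: re.sub(r"\bi\b", "I", t))
    # Capitalize the first letter of each review for consistent formatting.
    out["text"] = out["text"].str.replace(
        r"^([a-z])", lambda m: m.group(1).upper(), regex=True
    )
    # Keep only ratings on the valid 1-5 scale.
    out = out[out["rating"].between(1, 5)]
    return out.reset_index(drop=True)
```

Running the steps in this order matters: deduplication happens after trimming, so two copies of a review that differ only in surrounding whitespace collapse into one.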

The second stage applies rule-based filters to catch obvious violations quickly and transparently. Reviews containing advertisements, irrelevant off-topic content, spam-like repetition, or rants written by people who never visited the location are automatically flagged. These rules allow us to guarantee interpretability since it is always clear which specific violation triggered the flag.
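The rule stage can be illustrated with a minimal sketch; the patterns below are simplified stand-ins for our full rule set, but they show why the output is interpretable — each flag names the rule that fired:

```python
import re

# Hypothetical rule set; each entry maps a violation name to a pattern.
RULES = {
    "advertisement": re.compile(r"https?://|\bpromo code\b|\bdiscount\b", re.I),
    "never_visited": re.compile(r"\bnever (been|visited|went)\b", re.I),
    # Same word repeated four or more times in a row reads as spam.
    "spam_repetition": re.compile(r"\b(\w+)\b(?:\W+\1\b){3,}", re.I),
}


def rule_flags(text: str) -> list[str]:
    """Return the names of every rule a review violates (empty = passes)."""
    return [name for name, pattern in RULES.items() if pattern.search(text)]
```

A review that passes every rule proceeds to the semantic stage; a flagged review is rejected with the violating rule attached as the explanation.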

Once the obvious cases are filtered, the pipeline proceeds to semantic and machine learning evaluation. Here, a similarity model checks whether the content of a review is contextually relevant to the business it is associated with, detecting cases where reviews may be off-topic or misleading. We trained our classifier using a hybrid dataset: pseudo-labeled data to scale the training process and a manually verified set of about 1,100 reviews to improve accuracy and provide a reliable benchmark for evaluation.
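The shape of the relevance check can be sketched as follows. The real pipeline compares Sentence-BERT embeddings; the bag-of-words vectors and threshold here are a lightweight stand-in to show the cosine-similarity gate, not our production scoring:

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def is_relevant(review: str, business_desc: str, threshold: float = 0.2) -> bool:
    """Keep a review only if it is similar enough to the business description.
    Stand-in for the Sentence-BERT embedding comparison; threshold is illustrative."""
    va = Counter(review.lower().split())
    vb = Counter(business_desc.lower().split())
    return cosine_similarity(va, vb) >= threshold
```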

Finally, to address the limitations of traditional models, we incorporated a wrapper around OpenAI’s ChatGPT API. This wrapper acts as a secondary layer of judgment, particularly for cases where the ML classifier’s confidence is low. Through carefully engineered prompts and structured JSON outputs, the wrapper categorizes reviews into meaningful classes such as spam, irrelevant, or low quality and provides both a decision and a short rationale. This combination of deterministic rules, learned patterns, and language model reasoning allows our system to balance interpretability with adaptability, while remaining robust across different datasets and domains.
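The structured-output side of the wrapper might look like the sketch below; the prompt wording and label set are illustrative, and the key point is validating the model's JSON before trusting it:

```python
import json

# Hypothetical system prompt; the real prompt is more carefully engineered.
SYSTEM_PROMPT = (
    "You are a review moderator. Classify the review as one of: "
    "valid, spam, irrelevant, low_quality. "
    'Respond with JSON: {"label": ..., "rationale": ...}'
)

VALID_LABELS = {"valid", "spam", "irrelevant", "low_quality"}


def parse_llm_verdict(raw: str) -> dict:
    """Parse and validate the model's JSON reply, falling back to a
    conservative default when the output is malformed."""
    try:
        verdict = json.loads(raw)
        if isinstance(verdict, dict) and verdict.get("label") in VALID_LABELS:
            return {"label": verdict["label"],
                    "rationale": verdict.get("rationale", "")}
    except json.JSONDecodeError:
        pass
    return {"label": "needs_review", "rationale": "unparseable model output"}
```

The `needs_review` fallback means a flaky model reply never silently drops or approves a review.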

How we built it

Our system is a hybrid review classification pipeline that integrates rule-based filtering with a fine-tuned transformer model (RoBERTa). It is designed to automatically distinguish between valid and invalid user reviews, ensuring high reliability and efficiency.

On the technical side, our solution combines classical rules with modern machine learning. We implemented the pipeline in Python using pandas for dataset handling, ftfy and Unidecode for text normalization, and regex for pattern-based filtering. For semantic analysis, we leverage Hugging Face Transformers with RoBERTa-based Sentence-BERT embeddings in PyTorch to measure review–business relevance, giving us both linguistic depth and contextual accuracy. Development and testing were carried out in VSCode with reproducible Python environments.

To ensure interpretability and adaptability, we designed a hybrid approach of rule-based filters and embedding similarity checks. Metadata features such as review length, posting time, and user history further strengthen decision-making, while modular code design keeps the system extensible. This balance of heuristics and deep learning makes the model both reliable and scalable across different domains of online reviews.
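One way to picture how the stages combine is the routing sketch below; the confidence thresholds and metadata fields are hypothetical, but the order — rules veto first, confident model scores decide, and the uncertain middle band escalates to the LLM wrapper — mirrors the design described above:

```python
from dataclasses import dataclass


@dataclass
class Review:
    text: str
    rule_flags: list[str]   # violations found by the rule stage
    model_score: float      # classifier confidence that the review is valid
    length: int             # example metadata feature
    account_age_days: int   # example metadata feature


def route(review: Review, low: float = 0.35, high: float = 0.65) -> str:
    """Hybrid decision: deterministic rules first, then the learned model,
    with the low-confidence band handed to the LLM wrapper."""
    if review.rule_flags:
        return "reject"            # interpretable: the rule name explains why
    if review.model_score >= high:
        return "accept"
    if review.model_score <= low:
        return "reject"
    return "escalate_to_llm"       # uncertain cases get a second opinion
```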

Finally, we optimized the system for deployment by applying model compression techniques such as pruning, distillation, and quantization, which reduce memory and computation costs without sacrificing performance. This ensures the pipeline runs efficiently even in resource-constrained environments while maintaining accuracy. Together, these design choices make the solution scalable, interpretable, and deployment-ready.
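Of the compression techniques, dynamic quantization is the simplest to show; the sketch below applies it to a toy classifier head rather than our fine-tuned RoBERTa, but the call is the same:

```python
import torch
from torch import nn

# Toy stand-in for the fine-tuned classifier head. Dynamic quantization
# stores the Linear layers' weights as int8, shrinking memory with
# little accuracy cost.
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, 2))
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
```

The quantized module keeps the same forward interface, so it drops into the pipeline unchanged.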

Challenges we ran into

One of our biggest hurdles was optimizing our machine learning model for review filtering. We spent considerable time balancing accuracy with efficiency, as we needed the model to process reviews quickly while maintaining high precision in identifying relevant content. The training process was particularly challenging - we experimented with multiple architectures and hyperparameter configurations to find the sweet spot between model performance and computational requirements.

UI integration proved to be another significant obstacle. Connecting our backend filtering system with an intuitive, responsive frontend required extensive troubleshooting and coordination between team members. We encountered several compatibility issues when trying to display filtered results in real-time, and had to redesign our data flow multiple times to ensure smooth user interactions.

Additionally, we faced time constraints typical of hackathon environments, forcing us to make quick decisions about feature prioritization and technical trade-offs while maintaining code quality.

Accomplishments that we're proud of

We successfully developed a functional review filtering system that demonstrates real practical value for content discovery. Our final model achieved strong accuracy metrics while maintaining the efficiency needed for real-time filtering, striking the balance we initially set out to achieve.

The team overcame complex technical integration challenges to deliver a cohesive product with a clean, user-friendly interface. Despite the UI integration difficulties, we created an intuitive experience that effectively showcases our filtering capabilities.

We're particularly proud of our collaborative problem-solving approach - when faced with technical roadblocks, we adapted quickly and found creative solutions that kept the project moving forward. The final product represents not just our technical skills, but our ability to work effectively under pressure and deliver a meaningful solution within the hackathon timeframe.

Most importantly, we built something that addresses a real need in content filtering and demonstrates the potential for AI-driven review analysis in improving user experiences.

What we learned

This project gave us the chance to translate classroom knowledge of natural language processing, machine learning, and rule-based systems into a practical solution for moderating online reviews. We deepened our technical expertise by working with Hugging Face pre-trained models, regex-based filtering, and evaluation metrics, while also learning how these approaches complement one another. Beyond the technical aspects, we sharpened our ability to integrate theoretical methods with real-world business needs—balancing precision and recall in order to maintain trustworthy reviews without discarding authentic user feedback.
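The precision/recall trade-off we tuned can be made concrete with toy labels (invented for illustration, not our evaluation results), where 1 marks a review that should be filtered out:

```python
from sklearn.metrics import precision_score, recall_score

# Precision: of the reviews we flagged, how many truly deserved it?
# Recall: of the reviews that deserved flagging, how many did we catch?
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
precision = precision_score(y_true, y_pred)  # 2 of 3 flags were correct
recall = recall_score(y_true, y_pred)        # 2 of 3 bad reviews caught
```

Raising the decision threshold pushes precision up (fewer authentic reviews discarded) at the cost of recall (more bad reviews slipping through), which is exactly the balance we had to tune.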

What's next for ReviewGuard AI

Building on our system’s current ability to analyze both text and multimedia content in reviews, we aim to expand its scope and versatility. Future enhancements could include leveraging more advanced image and video classifiers or integrating additional metadata such as reviewer history and posting patterns to provide richer insights. By continuously incorporating new datasets and model improvements, ReviewGuard AI can maintain cutting-edge accuracy and efficiency in review moderation, ensuring even more comprehensive and reliable detection of spam or inappropriate content.

Built With

  • Development tools: VSCode
  • Frontend: Gradio interface, Streamlit, Flask API, HTML & CSS
  • APIs: OpenAI Chat Completions API, Hugging Face sentence-transformers, Git & GitHub
  • Libraries: pandas (data wrangling), re (regex rules for spam/ad detection), ftfy, Unidecode (text normalization), scikit-learn (evaluation metrics), sentence-transformers (semantic similarity check), matplotlib, seaborn, Plotly (visualizations and EDA)
  • Machine learning & NLP: Python, PyTorch, Sentence-BERT embeddings, sklearn.metrics, TensorFlow/Keras, Trainer API, image classification
  • Optimization & deployment: quantization, TorchScript JIT compilation
  • Dataset assets: Google Local Reviews dataset + manually labeled samples