Inspiration

Online reviews are a big part of how people choose where to eat, shop, or visit. But not all reviews are trustworthy — some are spammy advertisements or off-topic ramblings, some are inappropriate, and some even contain personal information. Businesses can be greatly affected by inauthentic or irrelevant reviews.
We wanted to build a tool that helps businesses and customers cut through the noise, surfacing only authentic, meaningful reviews.

What it does

Sift automatically classifies reviews into categories such as authentic, inappropriate, personal information, advertisement, or off-topic.
By filtering out spammy or irrelevant reviews, Sift ensures that both customers and businesses see genuine feedback that actually matters.

How we built it

  • Data Collection: We used Apify to scrape Google Maps reviews for real-world input.
  • Data Processing: We cleaned and structured the data using pandas, handling messy text and missing values.
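The cleaning step can be sketched roughly like this (the column names and sample rows here are illustrative, not our actual scraped schema):

```python
import pandas as pd

# Hypothetical scraped reviews; the real data came from the Apify scraper.
raw = pd.DataFrame({
    "author": ["alice", "bob", None, "alice"],
    "text": ["  Great food!  ", None, "Call 555-0100 for deals!!", "  Great food!  "],
    "rating": [5, 4, 1, 5],
})

cleaned = (
    raw
    .dropna(subset=["text"])                     # drop reviews with no text
    .assign(text=lambda d: d["text"].str.strip()  # trim and collapse whitespace
                              .str.replace(r"\s+", " ", regex=True))
    .drop_duplicates(subset=["author", "text"])  # remove duplicate submissions
    .reset_index(drop=True)
)
```

Chaining `dropna`, `drop_duplicates`, and vectorized string methods keeps the pipeline readable and avoids mutating the raw frame.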

Modeling: We experimented with multiple approaches:

  • TF-IDF + topic extraction to uncover common themes.
  • LLM few-shot prompting with open-source models to classify reviews based on policies we defined.
  • Hand-tuned regex rules for edge cases (like detecting emails or phone numbers).
Testing: We created custom reviews with misspellings and tricky cases to test the robustness of our classifier.
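The TF-IDF + topic extraction step can be sketched with scikit-learn; the toy reviews and the choice of NMF with two components are assumptions for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF

# Toy corpus standing in for scraped reviews (illustrative data).
reviews = [
    "great pizza and friendly staff",
    "pizza was cold but staff were friendly",
    "visit our website for discount codes",
    "discount codes on our website every week",
]

# Vectorize, then factorize into 2 topics; the top terms per topic
# hint at themes (genuine food feedback vs. promotional spam).
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(reviews)

nmf = NMF(n_components=2, random_state=0)
nmf.fit(X)
terms = vectorizer.get_feature_names_out()
topics = [
    [terms[i] for i in component.argsort()[::-1][:3]]
    for component in nmf.components_
]
```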
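For the few-shot prompting, a trimmed sketch of how such a prompt can be assembled is below; the policy names match our categories, but the example reviews and prompt wording are illustrative, not our exact prompt:

```python
# Categories from our policies; the few-shot examples are made up for illustration.
POLICIES = ["authentic", "advertisement", "off-topic", "personal_information", "inappropriate"]

FEW_SHOT = [
    ("Best ramen in town, the broth was rich and the noodles fresh.", "authentic"),
    ("Visit www.cheapdeals.example for 50% off!!!", "advertisement"),
    ("I don't like the mayor of this city.", "off-topic"),
]

def build_prompt(review: str) -> str:
    """Assemble a few-shot classification prompt for an open-source LLM."""
    lines = [f"Classify the review into one of: {', '.join(POLICIES)}.", ""]
    for text, label in FEW_SHOT:
        lines.append(f"Review: {text}\nLabel: {label}")
    lines.append(f"Review: {review}\nLabel:")
    return "\n".join(lines)
```

Ending the prompt with a bare `Label:` nudges the model to answer with just the category name, which makes the output easier to parse.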
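The regex edge-case handling can be sketched as follows; these particular patterns are simplified assumptions, not our full tuned rule set:

```python
import re

# Simplified patterns for personal-information leaks (illustrative, not exhaustive).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def contains_personal_info(text: str) -> bool:
    """Flag a review that leaks an email address or phone number."""
    return bool(EMAIL_RE.search(text) or PHONE_RE.search(text))
```

Rules like these are deterministic and cheap, which made them a reliable backstop for cases where the LLM was inconsistent.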

Challenges we ran into

  • Finding good, labeled datasets for spam reviews was tough.
  • Open-source LLMs often gave inconsistent results depending on the prompt, so we had to iterate on our prompt engineering.
  • Balancing accuracy against time constraints: some methods could be improved further, but we had to prioritize what was achievable within the hackathon timeframe.

Accomplishments that we're proud of

  • Building a working pipeline within the hackathon timeframe.
  • Combining traditional NLP (TF-IDF) with LLMs for more robust results.
  • Designing realistic edge cases and showing that our system handles them, since the scraped reviews alone didn't cover enough tricky variety.
  • Creating a flexible structure where new policies/categories can easily be added later.
  • Most importantly, learning a lot and gaining hands-on experience with what works and why.

What we learned

  • Prompt engineering is just as important as the model itself.
  • Sometimes the simplest fixes (like regex rules) are the most reliable.
  • How to connect multiple tools (Apify, pandas, scikit-learn, LLMs) into a smooth pipeline.
  • The importance of timeboxing — knowing when to stop experimenting and focus on delivering.

What's next for Sift

  • Expanding the classifier to support more nuanced categories (e.g. fake competitor attacks, AI-generated reviews).
  • Deploying it as a browser extension or API service for businesses.
  • Adding a dashboard with insights: authentic vs. spam ratio, review trends, etc.
  • Training on larger labeled datasets for higher accuracy.

Built With

Apify, pandas, scikit-learn, open-source LLMs