[DayOne] Filtering The Noise

AI-powered review filtering using transformers to detect spam, ads & irrelevant content. Tested on millions of Google reviews. Cleaner reviews, better decisions.

Inspiration

Online review platforms are plagued by spam, irrelevant content, and unverified complaints, making it hard for users to trust location reviews. We wanted to build a system that automatically filters out the noise, ensuring only genuine, relevant, and helpful reviews are surfaced.

What it does

[DayOne] Filtering The Noise is a machine learning pipeline that analyzes Google location reviews to:

Detect advertisements, spam, and irrelevant content
Assess review relevance to the business/location
Identify emotional rants and unconstructive feedback
Score reviews for quality and usefulness
Flag or filter reviews that violate platform policies
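
As a rough illustration of how the signals listed above might be combined into a single quality score and a flag decision, here is a minimal sketch. The field names, equal weighting, and thresholds are all assumptions made for this example and are not taken from the project's code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReviewScores:
    """Per-review signals in [0, 1]. All field names are hypothetical."""
    spam: float       # likelihood the text is an ad or spam
    relevance: float  # how closely the text relates to the location
    rant: float       # how strongly the text reads as an unconstructive rant


def quality_score(s: ReviewScores) -> float:
    """Average the 'good' side of each signal (equal weights are an assumption)."""
    return round((1.0 - s.spam + s.relevance + 1.0 - s.rant) / 3.0, 3)


def flag_reasons(s: ReviewScores,
                 spam_max: float = 0.5,
                 relevance_min: float = 0.3,
                 rant_max: float = 0.8) -> List[str]:
    """Return the reasons a review would be flagged; an empty list means keep it."""
    reasons = []
    if s.spam > spam_max:
        reasons.append("spam")
    if s.relevance < relevance_min:
        reasons.append("irrelevant")
    if s.rant > rant_max:
        reasons.append("rant")
    return reasons
```

A review scored as `ReviewScores(spam=0.9, relevance=0.1, rant=0.2)` would be flagged for both `"spam"` and `"irrelevant"` under these example thresholds.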

How we built it

We built a batch-processing pipeline using transformer-based models for spam detection, sentiment analysis, semantic similarity, and image classification. The system processes large datasets, tags reviews with multiple quality metrics, and exports comprehensive results to CSV. We also developed a FastAPI backend for real-time review analysis.
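
The batch flow described above can be sketched roughly as follows. This is not the project's implementation: the scorers here are toy stand-ins for the transformer models, and `batched`, `run_pipeline`, `toy_spam`, and `toy_length` are hypothetical names chosen for the example. The idea is that reviews are read in fixed-size chunks to bound memory use, each scorer tags the whole chunk, and one CSV row is written per review.

```python
import csv
import io
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List, TextIO

# A scorer takes a batch of texts and returns one score per text.
Scorer = Callable[[List[str]], List[float]]


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield lists of at most `size` items so the full dataset never sits in memory."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def run_pipeline(reviews: Iterable[str], scorers: Dict[str, Scorer],
                 out: TextIO, batch_size: int = 32) -> int:
    """Score reviews batch by batch and write one CSV row per review."""
    names = list(scorers)
    writer = csv.writer(out)
    writer.writerow(["text"] + names)
    written = 0
    for chunk in batched(reviews, batch_size):
        # One column of scores per scorer, aligned with the chunk.
        columns = [scorers[name](chunk) for name in names]
        for text, *scores in zip(chunk, *columns):
            writer.writerow([text] + [f"{s:.3f}" for s in scores])
            written += 1
    return written


# Toy scorers (assumptions): a real pipeline would call model inference here.
def toy_spam(texts: List[str]) -> List[float]:
    return [1.0 if "http" in t.lower() else 0.0 for t in texts]


def toy_length(texts: List[str]) -> List[float]:
    return [min(len(t.split()) / 20.0, 1.0) for t in texts]


buf = io.StringIO()
count = run_pipeline(
    ["Great coffee and friendly staff", "Visit http://example.com for deals"],
    {"spam": toy_spam, "length": toy_length},
    buf,
    batch_size=1,
)
```

Keeping each scorer behind the same batch-in, scores-out interface means a new analysis layer can be added by registering one more entry in the `scorers` dictionary.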

Challenges we ran into

Integrating multiple transformer models efficiently
Handling large-scale data processing and memory management
Designing robust relevance and quality scoring mechanisms
Ensuring the pipeline is extensible and easy to use
Validating performance

Accomplishments that we're proud of

End-to-end ML pipeline with multi-layer analysis
Real-time API for review evaluation
Comprehensive tagging and scoring of reviews
Successfully filtered out spam and irrelevant content from real-world datasets

What we learned

Practical challenges of batch ML processing
Importance of multi-dimensional review analysis
How transformer models can be combined for robust content moderation

What's next for [DayOne] Filtering The Noise

Expand to more regions and languages
Integrate user feedback for continuous improvement
Add more sophisticated image and text analysis layers
Deploy as a scalable cloud service for review platforms

Built With

cockroachdb
fastapi
huggingface
python
pytorch

Updates

Dylan Liew started this project — Aug 30, 2025 10:47 PM EDT
