Inspiration
This is my first hackathon, and I wanted to push myself by venturing into an area I was not familiar with. The problem statement was interesting and relatable.
What I Learned
- More data isn’t always better: naïve pseudo-labeling actually collapsed performance.
- Hard prompting (strict, tightly specified instructions) is what makes LLM outputs structured and reliable.
- Robust evaluation matters: small test sets can distort metrics if not handled carefully (a quick illustration follows this list).
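As a toy illustration with made-up numbers: at a 10% fake rate, a 20-review test fold holds only two fakes, so a single misclassification swings recall on the fake class by 50 points. Stratified cross-validation at least keeps that class balance stable across folds:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical dataset: 10% fakes (label 1), dummy features.
rng = np.random.default_rng(0)
y = np.array([1] * 10 + [0] * 90)
X = rng.normal(size=(100, 3))

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (_, test_idx) in enumerate(skf.split(X, y)):
    # Each 20-sample fold holds exactly 2 fakes; missing one halves recall.
    print(f"fold {fold}: {int(y[test_idx].sum())} fakes / {len(test_idx)} samples")
```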
How I Built It
- Baseline: Random Forest on TF-IDF text features plus metadata (reviewer history, Local Guide status, timestamps); a minimal pipeline sketch follows this list.
- Enhanced: Added pseudo-labels generated by Llama 2 (via the Ollama API) with structured JSON outputs; see the second sketch below.
- Dual Validation: Cross-checks uncertain ML predictions with the LLM; asymmetric fusion rules favor catching fakes while protecting precision, as in the third sketch below.
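A minimal sketch of the baseline, assuming hypothetical column names (`text`, `reviewer_review_count`, `is_local_guide`, `hour_posted`) and a binary `label` column; the project's actual features and hyperparameters may differ:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

df = pd.read_csv("reviews.csv")  # assumed input file

features = ColumnTransformer([
    # Sparse TF-IDF features from the review text.
    ("tfidf", TfidfVectorizer(max_features=5000, ngram_range=(1, 2)), "text"),
    # Metadata columns passed through as-is; trees don't need scaling.
    ("meta", "passthrough",
     ["reviewer_review_count", "is_local_guide", "hour_posted"]),
])

model = Pipeline([
    ("features", features),
    ("clf", RandomForestClassifier(
        n_estimators=300, class_weight="balanced", random_state=42)),
])

X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns=["label"]), df["label"],
    stratify=df["label"], random_state=42,
)
model.fit(X_train, y_train)
print("holdout accuracy:", model.score(X_test, y_test))
```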
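The pseudo-labeling step looked roughly like the following; the prompt wording, JSON schema, and confidence cutoff here are illustrative assumptions, not the project's exact choices:

```python
import json
import ollama

# Illustrative prompt; the double braces are literal JSON in the template.
PROMPT = """You are labeling reviews as fake or legitimate.
Respond ONLY with JSON: {{"label": "fake" or "legitimate", "confidence": 0.0-1.0}}

Review: {review}"""

def pseudo_label(review_text: str) -> dict:
    response = ollama.chat(
        model="llama2",
        messages=[{"role": "user", "content": PROMPT.format(review=review_text)}],
        format="json",  # ask Ollama to constrain the reply to valid JSON
    )
    return json.loads(response["message"]["content"])

label = pseudo_label("Best pizza ever!!! Five stars, tell all your friends!!!")
# Filtering on self-reported confidence is one possible guard against
# the pseudo-labeling collapse described under Challenges (assumption).
if label["confidence"] >= 0.8:
    print("pseudo-label:", label["label"])
```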
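The asymmetric fusion behind Dual Validation can be sketched as a simple rule set; the thresholds below are illustrative, not the tuned project values:

```python
UNCERTAIN_LOW, UNCERTAIN_HIGH = 0.35, 0.65  # assumed uncertainty band

def fuse(ml_prob_fake: float, llm_label: str | None) -> str:
    """Combine the ML fake-probability with an optional LLM verdict.

    Asymmetric rules: in the grey zone, a single "fake" signal from the
    LLM is enough to flag, while "legitimate" stays the default, which
    favors catching fakes without sacrificing precision.
    """
    # Confident ML predictions stand on their own; no LLM call needed.
    if ml_prob_fake >= UNCERTAIN_HIGH:
        return "fake"
    if ml_prob_fake <= UNCERTAIN_LOW:
        return "legitimate"
    # Uncertain band: defer to the LLM cross-check.
    if llm_label == "fake":
        return "fake"
    return "legitimate"
```

In use, `ml_prob_fake` would come from the Random Forest's `predict_proba` and `llm_label` from the `pseudo_label` helper above, invoked only when the prediction falls inside the band, which also bounds the latency cost noted under Challenges.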
Challenges
- Pseudo-labeling collapse (all predictions became “legitimate”).
- Designing synthetic fakes that look realistic.
- Managing latency trade-offs when involving an LLM.
Takeaway
Even when pseudo-labeling failed, Dual Validation acted as a safety net, showing that combining ML with LLM reasoning can make review platforms more robust and trustworthy.
Built With
- numpy
- ollama
- pandas
- python
- randomforestclassifier
- scikit-learn