Inspiration
Our inspiration came directly from the hackathon's challenge: the growing problem of untrustworthy online reviews. In an era where a single review can make or break a local business, ensuring the authenticity and relevance of user-generated content is crucial. We were motivated to build a smart, automated solution that could bring back trust and fairness to review platforms.
What it does
Our project, "Filtering the Noise," is a complete data processing pipeline that automatically analyzes and classifies Google Maps reviews based on three core policies. It leverages the powerful, multimodal google/gemma-3-12b-it model to analyze not just the text, but also the presence of images. The system processes a review, combines textual and metadata signals, and outputs a clear, structured JSON classification, which is then saved to a final CSV file.
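To make the output step concrete, here is a minimal sketch of how the model's JSON classification could be parsed and written to the final CSV. The three policy field names and the review ID column are illustrative assumptions, not the project's exact schema.

```python
# Hypothetical sketch of the pipeline's final step: parse the model's JSON
# classification and append it to the results CSV. The three policy fields
# and the "review_id" column are illustrative assumptions.
import json
import pandas as pd

raw_model_output = '{"advertisement": false, "irrelevant": false, "rant_without_visit": true}'

row = json.loads(raw_model_output)       # structured JSON from the model
row["review_id"] = "example-001"         # hypothetical identifier column

df = pd.DataFrame([row])
df.to_csv("classified_reviews.csv", index=False)  # final CSV of classifications
```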
How we built it
Our journey through this hackathon was a true engineering adventure, defined by overcoming successive technical roadblocks.
Our initial attempt to use the gemma-3-12b-it model via its remote API was thwarted by a persistent and elusive 403 Forbidden authorization error. After extensive debugging, we identified it as a server-side issue beyond our control.
In a strategic pivot, we switched to the meta-llama/Meta-Llama-3-8B-Instruct model's API. This was initially successful and validated our core logic! However, we soon hit a second wall: the 402 Payment Required error, having exhausted the free tier's rate limits after processing only a small portion of the data.
Faced with the limitations of remote APIs, we made our final and most decisive pivot: we brought the model in-house. We decided to implement a local inference pipeline directly within our Google Colab environment. This involved:
- Installing and configuring the `unsloth` library, a state-of-the-art tool for optimizing local model performance.
- Successfully loading the massive `gemma-3-12b-it` model onto the Colab GPU using 4-bit quantization to manage memory.
- Rewriting our core function to handle both text and image data for multimodal analysis.
This final approach was a resounding success, allowing us to process our entire dataset without any API limitations and unlocking the full potential of the Gemma 3 model.
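To give a flavor of the multimodal rewrite, here is a small sketch of assembling a combined text-and-image message in the Hugging Face chat-template style that Gemma 3 accepts. The function name, policy wording, and file path are our illustrative assumptions, not the project's actual code.

```python
# Illustrative sketch (assumed names and wording): building a multimodal
# chat message in the Hugging Face chat-template style for Gemma 3.
def build_messages(review_text, image_path=None):
    content = []
    if image_path:  # prepend the image block when the review has a photo
        content.append({"type": "image", "image": image_path})
    content.append({
        "type": "text",
        "text": (
            "Classify this review against our policies and reply with JSON.\n\n"
            f"Review: {review_text}"
        ),
    })
    return [{"role": "user", "content": content}]

messages = build_messages("Great tacos, friendly staff!", "review_photo.jpg")
```

A list like this can then be passed to the processor's chat-templating step before local inference.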
Challenges we ran into
- API Roadblocks: We faced both authorization (`403`) and payment/quota (`402`) errors from remote APIs, which taught us about the fragility of relying on external services under pressure.
- Complex Local Environment Setup: Moving to local inference introduced new challenges, including dependency management (`pip install` errors) and the need to optimize for limited GPU memory.
- Multimodal Data Handling: Correctly preprocessing and feeding both text and images into the Gemma 3 model required careful implementation following specific library patterns.
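The API failures above surfaced as plain HTTP status codes, so a small helper like this hypothetical one is one way to map them to actionable diagnoses (the function name and messages are our own illustration, not the project's code).

```python
# Hypothetical helper: map the HTTP status codes we kept hitting to
# human-readable diagnoses. Names and wording are illustrative.
def diagnose_status(code):
    if code == 403:
        return "authorization rejected (likely server-side)"
    if code == 402:
        return "payment required: free-tier quota exhausted"
    if code == 429:
        return "rate limited: back off and retry"
    return f"unexpected status {code}"
```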
What we learned
- Persistence Pays Off: When faced with external limitations, diving deeper into a more complex but more robust solution (local inference) can lead to a superior outcome.
- The Power of Optimization: Tools like `unsloth` are critical for making it feasible to run large models in resource-constrained environments like Google Colab.
- End-to-End Problem Solving: We didn't just build a model; we built a resilient system, navigating API failures and ultimately architecting a completely new inference backend to achieve our goal.
Built With
- gemma
- gemma-3
- google-colab
- google-drive
- hugging-face-hub
- hugging-face-transformers
- llama-3
- pandas
- python
- pytorch
- scikit-learn
- unsloth