ShearLuck: Our Journey to Filter Out the Noise
It all started with something simple: frustration. We’d go online to check reviews—looking for a restaurant, a café, or a museum—and instead of helpful insights, we found ads, random rants, or complaints from people who had never even been there. Reviews, which were supposed to build trust, were actually eroding it. And so we asked ourselves: “What if we could build something that sees through the noise? Something that keeps the truth, and filters out the rest?”
Where the spark came from
That question became the seed of ShearLuck. We didn’t begin with code or models; we began with a desire to restore trust in everyday decisions.
What ShearLuck does
ShearLuck is like a truth filter. It looks at a review and instantly knows:
- Is this genuine feedback?
- Is it just an ad?
- Is it irrelevant noise?
- Or is it from someone who never even visited?
For businesses, it means fewer fake reviews burying their work. For users, it means more confidence in the choices they make.
How we built it
We started by collecting data from everywhere we could: Kaggle, HuggingFace, old review corpora. We patched together 6,400 samples across hotels, restaurants, ads, and deceptive reviews, making sure each of our four categories had equal representation. From there, we cleaned and prepped every line. We looked at details others might overlook: the length of the review, its sentiment, even how well it matched the type of place it was supposed to describe. Then we combined those features with the language power of BERT embeddings, letting the model learn both meaning and context.
It wasn’t smooth sailing. We coded, tested, broke things, and tried again. Each mistake made the model a little sharper, a little closer to seeing reviews the way we do: some true, some hollow, some misleading.
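To make the feature step concrete, here is a minimal sketch of how hand-crafted signals (length, sentiment, relevance to the place type) might be appended to an embedding vector. It is not our actual pipeline: the word lists, the `handcrafted_features` and `combine` helpers, and the all-zero 768-dimensional placeholder standing in for a BERT sentence embedding are all assumptions made for illustration.

```python
import re

# Hypothetical word lists; a real system would use a trained sentiment model.
POSITIVE = {"great", "delicious", "friendly", "clean", "amazing"}
NEGATIVE = {"bad", "rude", "dirty", "terrible", "slow"}

# Hypothetical vocabulary per place type, used as a crude relevance signal.
PLACE_TERMS = {
    "restaurant": {"food", "menu", "waiter", "dish", "meal"},
    "hotel": {"room", "bed", "lobby", "checkin", "staff"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def handcrafted_features(review, place_type):
    """Return [token count, sentiment score, relevance score] for a review."""
    tokens = tokenize(review)
    n = len(tokens)
    if n == 0:
        return [0.0, 0.0, 0.0]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    sentiment = (pos - neg) / n
    place_vocab = PLACE_TERMS.get(place_type, set())
    relevance = sum(t in place_vocab for t in tokens) / n
    return [float(n), sentiment, relevance]


def combine(embedding, review, place_type):
    """Concatenate a text embedding with the hand-crafted features."""
    return list(embedding) + handcrafted_features(review, place_type)


# Placeholder vector; in practice this would come from a BERT encoder.
fake_embedding = [0.0] * 768
vector = combine(
    fake_embedding,
    "The food was great and the waiter was friendly",
    "restaurant",
)
print(len(vector), vector[-3:])
```

The combined vector can then be fed to any classifier head; concatenation keeps the hand-crafted signals visible to the model instead of hoping the embedding captures them on its own.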
The challenges
The hardest part? Data. Policy-violating reviews are rare, because most platforms delete them quickly. Balancing classes felt like walking a tightrope; too much of one kind, and the model would tilt unfairly. And even once we had accuracy, another challenge remained: how could we make people trust the model’s decision? To get more labeled data, we scaled up with few-shot labeling through the OpenAI API. To earn trust, we added explainability tools. With SHAP, users could literally see which words influenced the decision. Not just an answer, but a reason.
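The idea behind word-level explanations can be shown with a much simpler technique than SHAP: leave-one-out occlusion, where each word is removed in turn and the change in the score is recorded. The sketch below is that stand-in, not SHAP itself and not our deployed code. The `ad_score` function and its cue weights are hypothetical and only play the role of a trained classifier.

```python
import re

# Hypothetical cue words and weights standing in for a trained classifier.
AD_CUES = {"discount": 0.4, "promo": 0.4, "website": 0.3}


def ad_score(tokens):
    """Toy 'is this an ad?' score: the sum of cue weights found in the tokens."""
    return sum(AD_CUES.get(t, 0.0) for t in tokens)


def word_importance(text, score_fn):
    """Rank words by how much the score drops when each one is removed."""
    tokens = re.findall(r"[a-z']+", text.lower())
    base = score_fn(tokens)
    importances = []
    for i, tok in enumerate(tokens):
        without = tokens[:i] + tokens[i + 1:]  # occlude one word
        importances.append((tok, base - score_fn(without)))
    return sorted(importances, key=lambda pair: pair[1], reverse=True)


ranked = word_importance("Use promo code on our website for a discount", ad_score)
for word, delta in ranked[:3]:
    print(f"{word}: {delta:.2f}")
```

Occlusion is cheap and easy to reason about, but it ignores interactions between words; SHAP addresses that by averaging contributions over many word subsets, at a higher compute cost.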
What we achieved
Finally, we had something we were proud of: a model with nearly 99% accuracy, balanced across all four categories. And more than just numbers, we had a working tool, deployed with FastAPI and a React frontend, where anyone could paste a review, hit enter, and instantly see what the model thought, highlighted and explained.
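"Balanced across all four categories" is worth checking explicitly, because a single accuracy number can hide one weak class. The sketch below computes overall accuracy alongside per-category recall. The category names and the eight toy labels are made up for illustration and are not our evaluation data.

```python
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true category, the fraction of its samples predicted correctly."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / total[label] for label in total}


# Made-up toy labels using hypothetical category names.
y_true = ["genuine", "genuine", "ad", "ad",
          "irrelevant", "irrelevant", "no_visit", "no_visit"]
y_pred = ["genuine", "genuine", "ad", "ad",
          "irrelevant", "irrelevant", "no_visit", "genuine"]

accuracy = overall_accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)
print(accuracy)
print(recall)
```

In this toy run the overall accuracy looks respectable while one category is right only half the time, which is exactly the kind of gap a per-category breakdown is meant to expose.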
What we learned
We learned that truth online is fragile, but not impossible to protect. That balance is more valuable than sheer volume. And that people don’t just want accuracy; they want transparency. A result they can understand, not just accept.
What’s next
This is just the beginning. We see ShearLuck reaching across platforms such as Google Maps, TripAdvisor, and TikTok, anywhere reviews live. Our vision is simple: a world where moderation doesn’t mean chasing violations, but preserving trust. Where businesses aren’t drowned in false complaints, and users don’t waste time wading through noise.
Because at the end of the day, ShearLuck isn’t just a model. It’s our promise: to cut through the noise and let the truth be heard.