Inspiration
Online reviews shape where we eat, what we buy, and how we travel, yet a large portion are fake, promotional spam, or irrelevant noise. I noticed that many existing filtering systems are rule-based, which makes them brittle against increasingly sophisticated manipulation tactics. This inspired me to build RealViews, an ML-powered system that detects fake reviews so that consumers and businesses can trust the information they rely on.
What it does
RealViews automatically detects and filters policy-violating reviews such as fake content, advertisements, spam, and irrelevant text. It assigns each review a quality score between 0 and 1, explains each prediction alongside a confidence level, and supports multilingual analysis (English and Chinese) in real time (under 100 ms per review). The system also analyzes user metadata and temporal patterns to flag coordinated fake review campaigns.
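To illustrate the temporal-pattern idea, here is a minimal sketch of one way a burst of reviews could be flagged. It is not the RealViews implementation: the function names, the one-hour window, and the threshold of three reviews are all assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def max_reviews_in_window(timestamps, window):
    """Return the largest number of timestamps that fit inside one window."""
    stamps = sorted(timestamps)
    best = 0
    start = 0
    for end, ts in enumerate(stamps):
        # Move the left edge forward until the span is at most `window` long.
        while ts - stamps[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


def flag_bursts(reviews, window=timedelta(hours=1), min_reviews=3):
    """Map each business to its peak review count if it reaches `min_reviews`.

    `reviews` is an iterable of (business_id, timestamp) pairs. The window
    and threshold defaults are hypothetical, not values from RealViews.
    """
    by_business = defaultdict(list)
    for business_id, ts in reviews:
        by_business[business_id].append(ts)

    flagged = {}
    for business_id, stamps in by_business.items():
        peak = max_reviews_in_window(stamps, window)
        if peak >= min_reviews:
            flagged[business_id] = peak
    return flagged
```

A business that receives three reviews within twenty minutes would be flagged under these defaults, while one that receives two reviews three hours apart would not. In practice such a signal would probably be combined with user metadata rather than used alone.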
How I built it
I collected and processed ~4,772 labeled reviews across English and Chinese, extracted 25+ linguistic, sentiment, and behavioral features, and trained ensemble classifiers (logistic regression, random forest, gradient boosting) that reached F1 scores above 0.86. I then deployed the system as a Streamlit web app, integrating the Google Translate API for cross-language support and Plotly dashboards for interactive visualizations of suspicious review clusters.
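The feature-extraction and ensemble steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the six features shown are hypothetical stand-ins for the 25+ features mentioned above, and averaging model probabilities (soft voting) is one common way to combine classifiers, which the write-up does not confirm was the scheme used.

```python
import re

# Loose URL matcher used only for this example.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)


def extract_features(text):
    """Map a review string to a small dict of numeric features.

    The feature names are hypothetical examples of surface-level
    linguistic signals; they are not the RealViews feature set.
    """
    words = text.split()
    letters = [ch for ch in text if ch.isalpha()]
    upper = sum(1 for ch in letters if ch.isupper())
    unique = {w.lower() for w in words}
    return {
        "char_count": float(len(text)),
        "word_count": float(len(words)),
        "exclamation_count": float(text.count("!")),
        "uppercase_ratio": upper / len(letters) if letters else 0.0,
        "has_url": 1.0 if URL_PATTERN.search(text) else 0.0,
        "unique_word_ratio": len(unique) / len(words) if words else 0.0,
    }


def soft_vote(probabilities):
    """Average per-model violation probabilities into one ensemble score."""
    if not probabilities:
        raise ValueError("need at least one model probability")
    return sum(probabilities) / len(probabilities)
```

For example, `soft_vote([0.9, 0.8, 0.7])` averages three hypothetical model outputs into a single score of about 0.8. Note that `text.split()` relies on whitespace, so Chinese text would need a dedicated word segmenter before word-level features like these are meaningful.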
Challenges I ran into
The biggest hurdles were limited datasets (especially in Chinese), manual labeling that was both time-consuming and inconsistent, computational constraints that made large transformer training infeasible, and the time pressure of building a working prototype. Despite these challenges, the RealViews prototype reached F1 scores above 0.86 and scored each review in under 100 ms.
Accomplishments that I am proud of
Achieved F1 scores above 0.86 across multiple violation categories.
Designed an explainable AI interface so predictions aren’t just black-box outputs.
Showed that lightweight ML can filter reviews in real time without requiring massive infrastructure.
What I learned
I learned how to effectively combine NLP, metadata analysis, and ensemble ML into one system. I also deepened my understanding of cross-lingual NLP challenges, particularly around Chinese tokenization and translation pipelines. Most importantly, I learned how crucial explainability and user trust are when deploying AI for real-world decision-making.
What's next for RealViews
In the future, RealViews can be scaled with larger multilingual datasets (Spanish, Arabic, French, etc.) and improved through reinforcement learning with user feedback loops. I also plan to integrate LLM-based reasoning for deeper context analysis and expand the system into commercial APIs that platforms can use directly to safeguard their review ecosystems.
