What It Does
Kaypoh Aunty is a web application that transforms the experience of reading online reviews. It functions by:
- Scraping Data: It scrapes Google reviews in real time.
- AI-Powered Analysis: Each review is then processed by a custom-trained machine learning model.
- Intelligent Categorization: The model automatically classifies each review into one of five categories:
  - Useful Reviews
  - Advertisements
  - Spam
  - Rants Without Visits
  - Irrelevant Content
The result is a clean, filterable interface that allows users to sift through the noise and focus on the reviews that truly matter.
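As a rough illustration of the filtering step, the sketch below keeps only the reviews in the categories a user has selected and counts how many reviews fall into each category. The category keys and the `category` field on each review are assumptions made for this example; the label strings the deployed model actually returns are not stated here.

```javascript
// Hypothetical category keys; the deployed model's real label strings may differ.
const CATEGORIES = Object.freeze({
  USEFUL: "useful",
  ADVERTISEMENT: "advertisement",
  SPAM: "spam",
  RANT_WITHOUT_VISIT: "rant_without_visit",
  IRRELEVANT: "irrelevant",
});

// Return only the reviews whose category is in the selected set.
function filterReviews(reviews, selectedCategories) {
  const selected = new Set(selectedCategories);
  return reviews.filter((review) => selected.has(review.category));
}

// Count reviews per known category, e.g. to show totals next to each filter.
// Reviews with an unknown category are ignored.
function countByCategory(reviews) {
  const counts = {};
  for (const key of Object.values(CATEGORIES)) {
    counts[key] = 0;
  }
  for (const review of reviews) {
    if (Object.prototype.hasOwnProperty.call(counts, review.category)) {
      counts[review.category] += 1;
    }
  }
  return counts;
}
```

A frontend could call `filterReviews(allReviews, [CATEGORIES.USEFUL])` whenever the user toggles a filter, and use `countByCategory` to label each toggle with its total.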
How It Was Built
The application was constructed using a modern, API-driven architecture.
- Frontend: The user interface was built with HTML, CSS, and JavaScript, creating a lightweight and responsive experience.
- Data Scraping: We integrated the Apify API to handle the dynamic scraping of Google reviews.
- AI & Classification: The core of the project is a custom DistilBERT model, hosted on a Hugging Face Space and served through a dedicated Inference API. Our application's backend logic calls this API to classify reviews on the fly.
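The classification call described above might look roughly like the following. This is a minimal sketch, not the project's actual code: the endpoint URL, the `{ text }` request body, and the `{ label, score }` response shape are all assumptions, and the scraper is represented only by an array of review texts that have already been fetched.

```javascript
// Hypothetical endpoint; the real Hugging Face Space URL is not given here.
const CLASSIFY_ENDPOINT = "https://example.invalid/classify";

// Build the fetch options for one review (assumed JSON body: { text }).
function buildClassifyRequest(text) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  };
}

// Validate the assumed response shape: { label: string, score: number }.
function parseClassifyResponse(payload) {
  if (
    !payload ||
    typeof payload.label !== "string" ||
    typeof payload.score !== "number"
  ) {
    throw new Error("Unexpected classification response");
  }
  return { category: payload.label, confidence: payload.score };
}

// Classify one review. fetchImpl is injectable so this can be tested offline.
async function classifyReview(text, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(CLASSIFY_ENDPOINT, buildClassifyRequest(text));
  if (!response.ok) {
    throw new Error(`Classification request failed with status ${response.status}`);
  }
  return parseClassifyResponse(await response.json());
}

// Classify a batch of scraped review texts concurrently,
// keeping each text together with its result.
async function classifyReviews(texts, fetchImpl = globalThis.fetch) {
  const results = await Promise.all(
    texts.map((text) => classifyReview(text, fetchImpl))
  );
  return texts.map((text, i) => ({ text, ...results[i] }));
}
```

`Promise.all` sends every request at once, which is reasonable for a handful of reviews; a larger batch would likely need a concurrency limit or a batch endpoint, if the API offers one.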
Accomplishments We're Proud Of
- High Classification Accuracy: The model achieved high accuracy, particularly in separating clear-cut categories like "Spam" and "Advertisements" from genuine user feedback. This is the foundation of the application's value.
- Seamless API Integration: We successfully orchestrated multiple services, creating a smooth data pipeline from the Apify scraper to our custom Hugging Face Inference API and back to the frontend.
- Effective User Experience: We translated a complex backend process (scraping and AI analysis) into a simple, intuitive, and genuinely useful tool for the end-user.
What's Next for Kaypoh Aunty
Our roadmap for future development is focused on enhancing the model's intelligence and expanding the application's features.
- Enhanced Model with Metadata: The next version of the model will be retrained on a richer dataset. We plan to incorporate metadata, such as the star rating, review length, and post frequency, as additional input features. We hypothesize this will improve accuracy for more nuanced classifications.
- Sentiment Analysis: We intend to add a sentiment analysis layer (Positive, Neutral, Negative) to provide users with another powerful filtering dimension.
- Broader Platform Support: We plan to extend the scraper to analyze reviews from platforms beyond Google.
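To make the metadata idea in the roadmap more concrete, here is one way the extra features could be derived from a scraped review before being passed to a retrained model. The field names (`stars`, `text`, `authorReviewCount`, `authorActiveMonths`) and the definition of post frequency as reviews per active month are assumptions for illustration only; the roadmap does not specify them.

```javascript
// Derive numeric metadata features from one review record.
// All input field names are hypothetical.
function buildMetadataFeatures(review) {
  const text = review.text || "";
  const trimmed = text.trim();
  const wordCount = trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  // Treat accounts younger than one month as one month to avoid dividing by zero.
  const activeMonths = Math.max(review.authorActiveMonths || 0, 1);
  return {
    starRating: review.stars,
    reviewLengthChars: text.length,
    reviewLengthWords: wordCount,
    postsPerMonth: (review.authorReviewCount || 0) / activeMonths,
  };
}
```

How these numbers would be combined with the text input is a separate modelling decision that this sketch does not address.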
Built With
- css
- html
- javascript