Inspiration
The inspiration for this project is the idea of automatically filtering unnecessary reviews from relevant ones using machine learning algorithms and LLMs. With this, people who wish to visit a particular place can consult trustworthy reviews before deciding for themselves whether to visit. It should be a safe space for customers to leave honest opinions about a place so that everyone else is informed.
What it does
The BERT model classifies reviews into one of four categories: RELEVANT, ADVERTISEMENT, IRRELEVANT, or RANT WITHOUT VISIT. It is trained on carefully labelled data from a dataset published by Google, the Google Local Review data.
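As a rough illustration, inference with a model like this could look like the sketch below. The write-up does not name the inference library, so this uses Hugging Face transformers as one plausible choice; the checkpoint path `./review-classifier` is a placeholder.

```python
# Minimal inference sketch (assumes a BERT checkpoint fine-tuned on the four
# labels has been saved to ./review-classifier; library choice is illustrative).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["RELEVANT", "ADVERTISEMENT", "IRRELEVANT", "RANT WITHOUT VISIT"]

tokenizer = AutoTokenizer.from_pretrained("./review-classifier")
model = AutoModelForSequenceClassification.from_pretrained("./review-classifier")

def classify(review: str) -> str:
    # Tokenise, truncate to BERT's 512-token limit, and run a forward pass.
    inputs = tokenizer(review, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]

print(classify("Great coffee and friendly staff, will come back!"))  # e.g. RELEVANT
```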
How we built it
The model is a pre-trained BERT model fine-tuned for classification. It is fed training data consisting of 1,500 records per category. With this many examples, the model can generalise across reviews and, after fine-tuning, classify them into the correct categories.
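A minimal fine-tuning sketch is shown below. The write-up does not specify the training framework, so this uses Hugging Face transformers and datasets; `reviews.csv` is a hypothetical file with `text` and `label` columns, and the batch size is an assumption.

```python
# Fine-tuning sketch (framework choice and file layout are assumptions).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

LABELS = ["RELEVANT", "ADVERTISEMENT", "IRRELEVANT", "RANT WITHOUT VISIT"]
label2id = {label: i for i, label in enumerate(LABELS)}

# reviews.csv is hypothetical: "text" and "label" columns, 1,500 rows per label.
dataset = load_dataset("csv", data_files="reviews.csv")["train"]
dataset = dataset.map(lambda row: {"label": label2id[row["label"]]})
dataset = dataset.train_test_split(test_size=0.1)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS))

args = TrainingArguments(
    output_dir="review-classifier",
    num_train_epochs=10,              # matches the single 10-epoch run reported below
    per_device_train_batch_size=16)   # batch size is an assumption

Trainer(model=model, args=args, tokenizer=tokenizer,
        train_dataset=dataset["train"],
        eval_dataset=dataset["test"]).train()
```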
Challenges we ran into
With data at this scale, labelling manually was impossible. Using the OpenAI API for automatic labelling posed its own challenges: the model did not always interpret our prompts well, mistakenly labelling good reviews as bad and bad ones as good. Another challenge was that, even with millions of records, statistical methods and sentiment analysis could not pick out reviews that are rants without a visit from our dataset. This may be due to a lack of such examples in the dataset, or because statistical methods cannot capture these reviews as effectively as manual review could (though the pipeline for capturing rants without a visit is in place for further fine-tuning of the model).
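For illustration, an auto-labelling call could look like the sketch below; the model name, prompt, and fallback behaviour are our assumptions, not the exact setup we used.

```python
# Auto-labelling sketch (model name and prompt are illustrative).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
LABELS = ["RELEVANT", "ADVERTISEMENT", "IRRELEVANT", "RANT WITHOUT VISIT"]

def label_review(review: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Classify the location review into exactly one of: "
                        + ", ".join(LABELS) + ". Reply with the label only."},
            {"role": "user", "content": review},
        ],
        temperature=0,
    )
    answer = response.choices[0].message.content.strip().upper()
    # Guard against the failure mode described above: fall back if the model
    # returns something outside the label set.
    return answer if answer in LABELS else "IRRELEVANT"
```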
Accomplishments that we're proud of
The model achieved consistent metric scores across the board, and it did so in just one training session of 10 epochs. This suggests that our statistical data cleaning was good enough for the model to distinguish between the review categories. Accuracy, F1, precision, and recall all reached around 70%.
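For reference, these metrics can be computed with scikit-learn along the lines of the sketch below; the label lists here are toy placeholders, not our actual predictions.

```python
# Computing the reported metrics with scikit-learn; y_true / y_pred are toy
# placeholders standing in for the held-out labels and model predictions.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = ["RELEVANT", "ADVERTISEMENT", "IRRELEVANT",
          "RANT WITHOUT VISIT", "RELEVANT", "IRRELEVANT"]
y_pred = ["RELEVANT", "ADVERTISEMENT", "IRRELEVANT",
          "RANT WITHOUT VISIT", "ADVERTISEMENT", "RELEVANT"]

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro")  # macro-average weighs all four classes equally
print(f"accuracy={accuracy_score(y_true, y_pred):.2f} "
      f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```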
What we learned
Applying the correct statistical techniques to aid our labelling is crucial: they are the first line of defence when we encounter large datasets. Large datasets matter because they let us capture as many examples as possible before filtering down to the few rows used for model training, as in the sketch below.
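As a sketch of that first line of defence, simple statistical filters can be expressed in PySpark (which we list under Built With); the file path, column names, and filter rules here are illustrative assumptions, not our exact pipeline.

```python
# Statistical pre-filtering sketch in PySpark. The path, column names and
# rules are illustrative; the write-up does not list the exact filters used.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("review-filter").getOrCreate()
reviews = spark.read.json("google_local_reviews.json")  # hypothetical path

filtered = (reviews
            .dropDuplicates(["text"])                           # drop copy-pasted reviews
            .filter(F.length("text") > 20)                      # drop near-empty reviews
            .filter(~F.lower(F.col("text")).rlike("https?://")))  # crude ad signal

# Downsample to a balanced set: at most 1,500 random rows per label.
window = Window.partitionBy("label").orderBy(F.rand(seed=42))
balanced = (filtered
            .withColumn("rank", F.row_number().over(window))
            .filter(F.col("rank") <= 1500)
            .drop("rank"))
```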
What's next for 1. Filtering the Noise: ML for Trustworthy Location Reviews
We hope to see our model deployed at TikTok, reviewing comment sections or reviews from TikTok Shop. It would be an honour to have it deployed in real time, a testament to our hard work and an achievement for the public. We would also like to continue working on the model to keep improving its classification capabilities with sophisticated rules and LLM prompt engineering.
Built With
- apache-pyspark
- google-colab
- python
- spark-nlp
- visual-studio-code