📌 Project Overview

This project uses a BART-based zero-shot classification model to automatically classify Google reviews into four categories:

  • Relevant – Genuine reviews about the business
  • Irrelevant – Off-topic reviews not related to the business
  • Advertisement – Promotional content or links
  • Rant_No_Visit – Complaints from users who never visited the business

We built this as part of TikTok Tech Jam, where the focus is on solving real-world problems with creative AI solutions. The same approach could apply to TikTok comments, TikTok Shop reviews, and brand campaign feedback to separate meaningful content from noise.

⚙️ Methodology

  1. Model

     We used facebook/bart-large-mnli through Hugging Face’s zero-shot classification pipeline. Instead of training on labeled data, we crafted prompts with candidate labels to guide the model.
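A minimal sketch of this kind of zero-shot setup, assuming the `transformers` library is installed. The label descriptions and the hypothesis template below are illustrative guesses, not the prompts the project actually used, and `load_classifier` and `classify_review` are hypothetical helper names.

```python
from typing import Callable

# Hypothetical label descriptions; the project's real prompts are not shown.
LABEL_DESCRIPTIONS = {
    "relevant": "a genuine review of the business",
    "irrelevant": "off-topic content unrelated to the business",
    "advertisement": "promotional content or a link",
    "rant_no_visit": "a complaint from someone who never visited the business",
}
# Assumed template; "{}" is filled with each candidate label.
HYPOTHESIS_TEMPLATE = "This text is {}."


def load_classifier():
    """Load the zero-shot pipeline (downloads the model on first use)."""
    from transformers import pipeline

    return pipeline("zero-shot-classification", model="facebook/bart-large-mnli")


def classify_review(text: str, classifier: Callable) -> str:
    """Return the category whose description the model scores highest."""
    descriptions = list(LABEL_DESCRIPTIONS.values())
    result = classifier(
        text,
        candidate_labels=descriptions,
        hypothesis_template=HYPOTHESIS_TEMPLATE,
    )
    # For a single input, the pipeline returns labels sorted by descending score.
    best_description = result["labels"][0]
    by_description = {desc: name for name, desc in LABEL_DESCRIPTIONS.items()}
    return by_description[best_description]
```

To try it, call `classify_review("Great pasta and friendly staff!", load_classifier())`. Passing descriptive phrases rather than bare label names is one common way to give an NLI model more to work with, though whether it helps here would need to be checked against the labeled data.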

  2. Evaluation Metrics

     Because the classes are imbalanced, we used:

     • Macro F1 → the unweighted mean of per-class F1, so every class counts equally.
     • Weighted F1 → the mean of per-class F1 weighted by class frequency, which accounts for the imbalance.
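To make the difference between the two averages concrete, here is a small pure-Python sketch. The toy labels are invented for illustration and are not the project's data; in practice a library routine such as scikit-learn's `f1_score` with `average="macro"` or `average="weighted"` would do the same job.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """F1 for each class, computed from true/false positives and negatives."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by how often each class truly occurs."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)


# Invented toy data: one frequent class, one rare class.
y_true = ["irrelevant"] * 8 + ["advertisement"] * 2
y_pred = ["irrelevant"] * 8 + ["irrelevant", "advertisement"]

print(f"macro F1:    {macro_f1(y_true, y_pred):.3f}")
print(f"weighted F1: {weighted_f1(y_true, y_pred):.3f}")
```

On this toy data the weighted score comes out higher than the macro score, because the one missed advertisement hurts the rare class a lot but barely moves a frequency-weighted average. Reporting both makes that gap visible.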

  3. Findings

     • Strong performance on frequent classes (e.g., irrelevant).
     • Weaker performance on minority classes (advertisement, rant_no_visit).
     • Common confusions: irrelevant ↔ rant_no_visit, and subtle ads misclassified as relevant.
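Confusions like these can be surfaced by counting (true label, predicted label) pairs. The sketch below uses invented toy labels and a hypothetical helper name, `top_confusions`; it is one simple way to do this, not necessarily how the project produced its findings.

```python
from collections import Counter


def top_confusions(y_true, y_pred, n=3):
    """Return the n most frequent (true, predicted) pairs that disagree."""
    errors = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return errors.most_common(n)


# Invented toy data for illustration only.
y_true = ["irrelevant", "irrelevant", "rant_no_visit", "rant_no_visit",
          "rant_no_visit", "advertisement", "relevant"]
y_pred = ["irrelevant", "rant_no_visit", "irrelevant", "irrelevant",
          "rant_no_visit", "relevant", "relevant"]

for (true_label, predicted), count in top_confusions(y_true, y_pred):
    print(f"{true_label} predicted as {predicted}: {count}")
```

Sorting error pairs by frequency gives a short list of which class boundaries to look at first when revising prompts or labels.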

  4. Limitations

     • The dataset is imbalanced.
     • Zero-shot models are sensitive to prompt design.
     • BART is not fine-tuned on review/comment domain language.

  5. Improvements

     • Fine-tune on labeled review/comment datasets.
     • Balance the dataset with oversampling or class-weighted training.
     • Experiment with newer instruction-tuned LLMs for better adaptation.
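As a rough illustration of the balancing idea, the sketch below shows two simple options in pure Python: inverse-frequency class weights (the same heuristic as scikit-learn's "balanced" mode) and random oversampling of minority classes. The function names and toy data are hypothetical, and neither step was part of the zero-shot setup described above.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def oversample(texts, labels, seed=0):
    """Randomly duplicate minority-class examples until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    target = max(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced.extend((text, label) for text in items + extra)
    rng.shuffle(balanced)
    return balanced


# Invented toy data: six frequent-class reviews, two rare-class reviews.
texts = [f"review {i}" for i in range(8)]
labels = ["irrelevant"] * 6 + ["advertisement"] * 2

print(balanced_class_weights(labels))
print(Counter(label for _, label in oversample(texts, labels)))
```

Class weights leave the data untouched and change the loss instead, while oversampling changes the data and works with any training loop. Oversampling should be applied to the training split only, so duplicated examples do not leak into evaluation.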

Built With

  • facebook/bart-large-mnli
  • llm
  • python