TikTok TechJam 2025 – Devpost Writeup

1. Problem and Solution

Problem
Online reviews shape decisions about where people eat, shop, and travel. Unfortunately, many reviews are misleading — short spam (“Nice 👍”), disguised advertisements, or rants from people who have never visited the location. These reduce trust for users and unfairly harm businesses.

Solution
We built a machine learning (ML) pipeline with a demo web app that automatically flags reviews as Relevant, Spam, Advertisement, or Rant.

Model V1 (Machine Learning): Semantic embeddings + Principal Component Analysis (PCA) + Logistic Regression. Simple and interpretable.
Model V2 (Neural Network): Semantic embeddings + PyTorch Neural Network (NN). Captured deeper patterns and achieved 95.6% test accuracy with a weighted F1 score of 0.93, outperforming V1 across almost all categories.

2. Development Environment and Tools

Visual Studio Code (VSCode): Development, scripting, debugging
Google Colaboratory (Colab): GPU-based training and fast experiments
Flask + Jinja2 templates: Serving and rendering the demo web app
PyTorch: Building, training, and saving the neural network (Model V2)
Joblib: Saving and reusing PCA + Logistic Regression models (Model V1)

3. APIs Used

Google Maps API (googlemaps): Metadata like business descriptions to test review–description similarity
SentenceTransformers library (all-MiniLM-L6-v2): Semantic embeddings generated from review text

4. Libraries and Frameworks

Core ML / NLP: scikit-learn, sentence-transformers, torch, numpy, pandas
Web / API: flask, googlemaps, dotenv
Utilities: nltk, re, joblib, json

5. Assets and Datasets Used

Google Local Reviews datasets (Kaggle + UCSD)
70,000 manually labelled reviews (Spam, Advertisement, Rant, Relevant)

Extra feature experiments:

Rating deviation (review rating vs. business average)
Review–business description similarity

These features showed some promise, but only rant detection benefited, so text-only embeddings remained the more reliable choice overall.

6. Solution Flow

Step 1: Data Cleaning

Normalized review text: lowercased, stripped punctuation/emojis, removed stopwords, lemmatized words.
Applied the same normalization to every review so the text is consistent across the dataset.
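
The cleaning step above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the stopword set is a tiny hypothetical stand-in for a full list such as NLTK's, and `identity_lemma` is a placeholder where a real lemmatizer would be plugged in.

```python
import re

# Tiny illustrative stopword set (an assumption, not the project's list).
# A full list, e.g. NLTK's English stopwords, would normally replace it.
STOPWORDS = {"a", "an", "the", "is", "was", "were", "and", "or",
             "to", "of", "it", "this", "that", "very"}


def identity_lemma(token: str) -> str:
    """Placeholder lemmatizer (hypothetical): returns the token unchanged."""
    return token


def clean_review(text: str, lemmatize=identity_lemma) -> str:
    """Lowercase, strip punctuation/emojis, drop stopwords, lemmatize."""
    text = text.lower()
    # Keep only ASCII letters, digits and whitespace. This removes
    # punctuation and emojis, and (as a simplification) accented letters too.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    tokens = [lemmatize(tok) for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)
```

For example, `clean_review("The food was AMAZING!!! 😍")` returns `"food amazing"`.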

Step 2: Semantic Embeddings

Used all-MiniLM-L6-v2 to convert reviews into dense vectors.
Captured meaning rather than just keywords (e.g., “great service” ≈ “amazing staff”).
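
A hedged sketch of the embedding step, assuming the `sentence-transformers` package is installed. The helper names `embed_reviews` and `cosine_similarity` are illustrative and not taken from the project. The similarity function is plain Python so that the idea of "similar meaning, nearby vectors" can be checked on its own.

```python
import math
from typing import List, Sequence


def embed_reviews(texts: List[str]):
    """Encode reviews with all-MiniLM-L6-v2; returns one 384-dim vector per text."""
    # Imported inside the function so the similarity helper below
    # can be used even where sentence-transformers is not installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    return model.encode(texts)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

With the library available, `vecs = embed_reviews(["great service", "amazing staff"])` followed by `cosine_similarity(vecs[0], vecs[1])` gives a score that can be compared against the score for an unrelated pair of phrases.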

Step 3a: Model V1 – Logistic Regression (Baseline)

PCA reduced embeddings (384 → 128 dimensions).
Logistic Regression classifier trained on compressed vectors.
Results: 94.0% accuracy; F1 scores – Relevant: 0.966, Spam: 0.795.
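
The V1 baseline could look roughly like the scikit-learn pipeline below. Only the 384 → 128 PCA reduction and the Logistic Regression classifier come from the description above; the function name and `max_iter` value are assumptions, and the project's actual hyperparameters are not stated.

```python
def build_v1_pipeline(n_components: int = 128, max_iter: int = 1000):
    """PCA (384 -> 128 dims by default) followed by Logistic Regression."""
    # Imported here so the sketch can be loaded without scikit-learn installed.
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    return make_pipeline(
        PCA(n_components=n_components),
        # max_iter is an assumed value, raised from the default to help convergence.
        LogisticRegression(max_iter=max_iter),
    )
```

Usage would be `pipe = build_v1_pipeline(); pipe.fit(X_train, y_train)`, where `X_train` holds the 384-dimensional embeddings. The fitted pipeline can then be stored with `joblib.dump(pipe, path)` and restored with `joblib.load(path)`.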

Step 3b: Model V2 – Neural Network (Final)

PyTorch NN: Linear → ReLU → Dropout → Linear (4-class output).
Trained with Adam optimizer + CrossEntropy loss.
Results: 95.6% accuracy; F1 scores – Relevant: 0.976, Spam: 0.865.
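
A minimal PyTorch sketch of the V2 architecture and training loop described above. The layer order, optimizer, and loss follow the writeup; the hidden size, dropout rate, learning rate, epoch count, and the choice to feed raw 384-dimensional embeddings are all assumptions made for illustration.

```python
def build_v2_model(input_dim: int = 384, hidden_dim: int = 128,
                   num_classes: int = 4, dropout: float = 0.3):
    """Linear -> ReLU -> Dropout -> Linear, producing one logit per class."""
    import torch.nn as nn  # imported lazily so the sketch loads without torch

    return nn.Sequential(
        nn.Linear(input_dim, hidden_dim),   # hidden_dim is an assumed value
        nn.ReLU(),
        nn.Dropout(dropout),                # dropout rate is an assumed value
        nn.Linear(hidden_dim, num_classes),
    )


def train(model, features, labels, epochs: int = 5, lr: float = 1e-3):
    """Full-batch training with Adam and cross-entropy; returns per-epoch losses."""
    import torch

    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()  # expects raw logits and integer labels
    model.train()
    losses = []
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(features), labels)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses
```

A real run over 70,000 reviews would iterate over mini-batches (for example with `torch.utils.data.DataLoader`) rather than a single full batch; the full-batch loop just keeps the sketch short.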

Step 4: Feature Engineering Experiments

Statistical analysis of review metadata (rating, business average rating, number of reviews, pictures, owner responses).
Findings:
- Owner responses correlated with Ads more than Spam or Rants.
- Ads generally had higher ratings than Spam or Rants.
Limitations: Small sample size (Ads n=93) reduced reliability.
Rating deviation and review–description similarity were also tested, but only rant detection benefitted.
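
One way the two extra signals could be attached to a text embedding is sketched below. The signed-difference convention for rating deviation and the simple concatenation in `augment_features` are assumptions; the writeup does not say how the features were combined with the embeddings.

```python
from typing import List, Sequence


def rating_deviation(review_rating: float, business_avg_rating: float) -> float:
    """Signed gap between a review's rating and the business average.

    Negative values mean the review is harsher than average
    (sign convention assumed for this sketch).
    """
    return review_rating - business_avg_rating


def augment_features(embedding: Sequence[float],
                     review_rating: float,
                     business_avg_rating: float,
                     description_similarity: float) -> List[float]:
    """Append the two experimental features to a review embedding (hypothetical helper)."""
    return list(embedding) + [
        rating_deviation(review_rating, business_avg_rating),
        description_similarity,
    ]
```

For instance, a 1-star review of a business averaging 4.5 stars gets a deviation of -3.5, which together with a low review–description similarity is the kind of pattern a rant might show.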

Step 5: Evaluation

Metrics: Precision, Recall, F1.
Headline results: 95.6% test accuracy, Weighted F1 ≈ 0.93.
Relevant reviews preserved with very high recall (0.978).
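
For reference, the metrics above can be computed as in the plain-Python sketch below. It is illustrative only (scikit-learn's `classification_report` reports the same quantities), and the function names are not from the project.

```python
from collections import Counter
from typing import Dict, Sequence


def per_class_metrics(y_true: Sequence[str],
                      y_pred: Sequence[str]) -> Dict[str, Dict[str, float]]:
    """Precision, recall, F1 and support for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this label, but it was wrong
            fn[truth] += 1  # missed the true label
    support = Counter(y_true)
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall,
                          "f1": f1, "support": support[label]}
    return metrics


def weighted_f1(metrics: Dict[str, Dict[str, float]]) -> float:
    """Average of per-class F1 scores, weighted by each class's support."""
    total = sum(m["support"] for m in metrics.values())
    return sum(m["f1"] * m["support"] for m in metrics.values()) / total
```

Because the weighted F1 weights each class by its number of true examples, a large Relevant class dominates the headline figure, which is why the per-class scores are reported alongside it.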

Step 6: Flask Web App (UI)

Enter place → Select correct match → Load real reviews → Classify into categories.
Optional: Supplying a business description improves rant detection.
UI shows performance metrics.

7. How the Solution Addresses the Problem

Semantic understanding: Embeddings interpret meaning, not just keywords.
Policy alignment: Maps directly to TikTok’s moderation categories (Spam, Ads, Irrelevant, Rants).
Scalability: Neural networks scale well to large datasets.
Iteration-driven improvement: Compared classical ML vs. NN; NNs proved stronger.
Practical demo: Flask app with an interactive UI shows how the classifier could be deployed for real-time use.

8. Conclusion

Our project shows how ML + NLP can make review platforms more trustworthy by filtering noise and surfacing genuinely helpful feedback.

Neural network achieved 95.6% accuracy, Weighted F1 ≈ 0.93.
High recall on Relevant reviews (0.978) means genuinely valuable reviews are rarely filtered out.
Spam detection improved over the baseline (F1 of 0.865 for V2 vs. 0.795 for V1).
Rants remain challenging, but experimental signals such as review–description similarity point to directions for future work.

Impact:

Users make better decisions with reliable reviews.
Businesses get fairer representation.
Platforms benefit from scalable, automated moderation.

9. Interactive Demo

Search: Enter the name of any place.
Select: Choose the correct match from a candidate list.
Classify: Loads reviews with classification tags (Relevant, Spam, Rant, Advertisement).
Optional description input: Improves rant detection.
See metrics: Model performance shown in UI.

🔗 GitHub Repo: Boolean Brotherhood
▶️ YouTube Demo: Watch here