Inspiration

Our project was inspired by two key motivations:

  • Giving back to the TikTok community by providing meaningful suggestions to enhance safety and usability.
  • Addressing the concerning trend of oversharing on social media. Platforms should serve as safe, creative outlets where users feel protected as they express themselves.

What It Does

The application allows users to upload videos, automatically transcribes the spoken content, and analyzes it to detect any Personally Identifiable Information (PII). When PII is identified, the system flags it to alert users, helping them avoid unintentionally sharing sensitive details.
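
As a rough illustration of the flagging step, here is a minimal sketch that scans a transcript and reports where sensitive details appear. It is an assumption-laden stand-in: the two regular expressions replace the AI models the app actually uses, and the names `PATTERNS`, `Flag`, and `flag_pii` are hypothetical.

```python
import re
from dataclasses import dataclass

# Stand-in detectors (assumption): the app relies on NER models, not regexes.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
}


@dataclass(frozen=True)
class Flag:
    label: str   # kind of PII that was found
    text: str    # the matched substring
    start: int   # character offset where the match begins
    end: int     # character offset just past the match


def flag_pii(transcript: str) -> list[Flag]:
    """Return every pattern match in the transcript, ordered by position."""
    flags = [
        Flag(label, match.group(), match.start(), match.end())
        for label, pattern in PATTERNS.items()
        for match in pattern.finditer(transcript)
    ]
    return sorted(flags, key=lambda f: f.start)


transcript = "Text me at 555-123-4567 or email jane.doe@example.com."
for flag in flag_pii(transcript):
    print(f"{flag.label}: {flag.text!r} at [{flag.start}, {flag.end})")
```

Keeping character offsets with each flag lets the interface highlight the exact words in the transcript that triggered the warning.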


How We Built It

  • Frontend Development: Implemented using Expo Go to build a user-friendly interface.
  • Speech-to-Text Pipeline: Used OpenAI's Whisper to transcribe the spoken audio in each video into text.
  • Model Integration: Combined the strengths of multiple AI models to achieve robust detection:
    • Stanford AIMI Deidentifier — for reliable PII detection.
    • BERT Named Entity Recognition (NER) — for entity extraction and classification.
    • Isotonic DeBERTa AI4Privacy — for enhanced privacy-focused detection.
  • Real-Time Processing: Ran transcription and analysis as background jobs with status tracking, so users get timely feedback while a video is still being processed.
  • Containerized Deployment: Packaged services in Docker for scalable and reproducible deployment.
  • Model Caching: Enabled persistent model storage to reduce latency and avoid unnecessary re-downloads.
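
The background job processing with status tracking described above can be sketched roughly as follows. This is not our production code: `JobTracker` and its method names are hypothetical, the in-memory dictionary is an assumption (a real deployment would likely use a persistent store), and the `process` callable is a stub where Whisper transcription and the PII models would be invoked.

```python
import threading
import uuid
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


class JobTracker:
    """Runs video jobs on a thread pool and records each job's status."""

    def __init__(self, process: Callable[[str], dict], max_workers: int = 2):
        self._process = process
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: dict[str, dict] = {}
        self._lock = threading.Lock()

    def submit(self, video_path: str) -> str:
        """Queue a video and return a job id the client can poll."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = {"status": "queued", "result": None, "error": None}
        self._pool.submit(self._run, job_id, video_path)
        return job_id

    def _run(self, job_id: str, video_path: str) -> None:
        self._update(job_id, status="processing")
        try:
            result = self._process(video_path)
        except Exception as exc:  # record the failure instead of losing it
            self._update(job_id, status="failed", error=str(exc))
        else:
            self._update(job_id, status="done", result=result)

    def _update(self, job_id: str, **fields) -> None:
        with self._lock:
            self._jobs[job_id].update(fields)

    def status(self, job_id: str) -> dict:
        """Return a snapshot of the job's current state."""
        with self._lock:
            return dict(self._jobs[job_id])

    def shutdown(self) -> None:
        """Block until all queued jobs have finished."""
        self._pool.shutdown(wait=True)


def fake_process(video_path: str) -> dict:
    # Placeholder for transcription plus PII detection.
    return {"video": video_path, "flags": []}


tracker = JobTracker(fake_process)
job_id = tracker.submit("clip.mp4")
tracker.shutdown()
print(tracker.status(job_id)["status"])
```

Returning a job id immediately keeps the upload request short, and the frontend can ask for the status until the job reports `done` or `failed`.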

Challenges We Faced

  • Model integration: Different AI models had overlapping outputs and inconsistent confidence scores, which required additional logic to reconcile results.
  • Performance bottlenecks: Running large models like Whisper and DeBERTa on video content was resource-intensive, forcing us to implement caching, batching, and background job processing.
  • Deployment complexity: Packaging multiple models and services into a containerized environment introduced dependency and memory management issues.
  • Data variability: Real-world TikTok videos had diverse accents, background noise, and informal language, which reduced transcription and detection accuracy.
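
To make the model integration challenge concrete, here is one possible way to reconcile overlapping detections. It is a sketch under assumptions, not the exact logic we shipped: the per-model weights, the threshold, the labels, and the names `Span`, `MODEL_WEIGHTS`, and `reconcile` are all illustrative. Each score is scaled by a weight for its source model, then the highest-scoring span in each overlapping group is kept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int    # character offset where the entity begins
    end: int      # character offset just past the entity
    label: str    # label as reported by the source model
    score: float  # raw confidence from the source model
    source: str   # which model produced the span


# Hypothetical weights to put differently calibrated scores on one scale.
MODEL_WEIGHTS = {"deidentifier": 1.0, "bert_ner": 0.8, "deberta_ai4privacy": 0.9}


def reconcile(spans: list[Span], threshold: float = 0.5) -> list[Span]:
    """Keep the best weighted span per overlapping group, ordered by position."""
    weighted = [(s.score * MODEL_WEIGHTS.get(s.source, 1.0), s) for s in spans]
    kept: list[Span] = []
    # Visit spans from most to least confident so stronger spans win overlaps.
    for score, span in sorted(weighted, key=lambda pair: pair[0], reverse=True):
        if score < threshold:
            continue
        if any(span.start < k.end and k.start < span.end for k in kept):
            continue  # overlaps a span that already won
        kept.append(span)
    return sorted(kept, key=lambda s: s.start)


spans = [
    Span(11, 15, "PER", 0.95, "bert_ner"),
    Span(11, 19, "NAME", 0.90, "deidentifier"),
    Span(30, 42, "PHONE", 0.55, "bert_ner"),
    Span(30, 42, "PHONE_NUMBER", 0.88, "deberta_ai4privacy"),
]
for span in reconcile(spans):
    print(span.label, span.source, (span.start, span.end))
```

A greedy pass like this is simple to reason about, but it discards agreement between models; voting or averaging across sources would be a reasonable alternative.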

Accomplishments That We're Proud Of

  • Creating a functional app that can upload videos, process them, and return results end-to-end.
  • Successfully integrating multiple AI models (Whisper, Stanford AIMI, BERT, DeBERTa) into a single workflow.
  • Achieving reliable PII detection across varied video and audio inputs.
  • Designing a user interface that makes privacy insights accessible and easy to understand.

What We Learned

  • How to integrate multiple AI models (Whisper, BERT, DeBERTa, Stanford AIMI) into one workflow and handle overlapping outputs.
  • The importance of performance optimization, including model caching, background job processing, and containerization.
  • Best practices for making privacy results clear and actionable to end users.
  • Practical considerations when applying AI to real-world social media content, such as accuracy trade-offs and scalability.

What's Next

  • Expand support for additional file types and social media platforms.
  • Improve detection accuracy by fine-tuning models on larger, domain-specific datasets.
  • Add user controls to customize what kinds of PII are flagged.
  • Optimize system performance for lower latency and reduced resource usage.
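
The planned user controls could look something like the sketch below. Nothing here exists yet: `FlagPreferences`, `apply_preferences`, the category names, and the dictionary shape of a detection are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FlagPreferences:
    """Per-user settings for which detections are surfaced."""

    enabled: set[str] = field(
        default_factory=lambda: {"NAME", "PHONE", "EMAIL", "ADDRESS"}
    )
    min_score: float = 0.5  # hide detections below this confidence

    def allows(self, label: str, score: float) -> bool:
        return label in self.enabled and score >= self.min_score


def apply_preferences(detections: list[dict], prefs: FlagPreferences) -> list[dict]:
    """Filter detections down to the ones the user wants to see."""
    return [d for d in detections if prefs.allows(d["label"], d["score"])]


detections = [
    {"label": "NAME", "score": 0.92},
    {"label": "EMAIL", "score": 0.40},
    {"label": "ADDRESS", "score": 0.81},
]
prefs = FlagPreferences(enabled={"NAME", "EMAIL"})
print(apply_preferences(detections, prefs))
```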
