TikTok Geo-Compliance Co-pilot: Project Story

Inspiration

We were inspired by the growing complexity of global regulation around social media features. Countries and even individual U.S. states impose unique requirements: Europe's **DSA**, California's **SB-976**, Florida's **HB-3**, Utah's **SB-194/HB-464**, and U.S. federal obligations like **18 U.S.C. §2258A** (NCMEC reporting).
Traditionally, teams relied on manual reviews and legal consultation to decide whether a feature required geo-specific logic. This process is slow, error-prone, and hard to audit.

Our inspiration was:

  • Can we turn guesswork into governance?
  • Can an AI-powered co-pilot analyze feature artifacts and highlight compliance needs before launch?
  • Can we generate audit-ready evidence so TikTok can answer regulators with confidence?

That led us to build TikTok Geo-Compliance Co-pilot.


What it does

The system ingests feature descriptions, PRDs, and TRDs and outputs:

  • Classification: YES / NO / UNCERTAIN — does this feature require geo-specific compliance logic?
  • Evidence-based reasoning: concise explanation grounded in retrieved regulations
  • Optional regulatory mapping: links to specific laws or sections
  • Audit-ready trail: CSV outputs with model version, KB version, and timestamp

A clean Streamlit UI lets users paste or upload artifacts, click Analyze, and instantly review compliance flags.
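The audit-ready trail above can be sketched as a small helper that stamps each decision with model and KB versions plus a UTC timestamp before writing CSV. This is a minimal illustration, not the actual pipeline code; the version strings and column names here are hypothetical placeholders.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical metadata; the real pipeline records its own versions.
MODEL_VERSION = "model-2025-01"
KB_VERSION = "kb-2025-01"

def audit_row(feature_name: str, label: str, reasoning: str) -> dict:
    """Build one audit-trail row with model/KB versions and a UTC timestamp."""
    return {
        "feature": feature_name,
        "label": label,                 # YES / NO / UNCERTAIN
        "reasoning": reasoning,
        "model_version": MODEL_VERSION,
        "kb_version": KB_VERSION,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

def write_audit_csv(rows: list[dict]) -> str:
    """Serialize audit rows to CSV text (written to disk in the real app)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Because every row carries its own model and KB versions, a regulator-facing export remains interpretable even after the pipeline is upgraded.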


How we built it

  1. Knowledge Base (KB): We curated and chunked legal text from DSA, CA SB-976, FL HB-3, UT SB-194/HB-464, and NCMEC reporting rules. These were indexed in a RAGFlow instance.
  2. Retrieval-Augmented Generation (RAG): Feature-derived queries run against the KB using semantic search, with optional OpenAI-based reranking.
  3. LLM Classifier: The model outputs strictly formatted JSON (YES/NO/UNCERTAIN + reasoning + cited regulations).
  4. Evidence Reranker: An optional reranking step improves precision by boosting the most relevant legal snippets.
  5. Frontend: Built in Streamlit with TikTok-themed styling, allowing easy input and visualization.
  6. Audit Outputs: All decisions are saved as CSVs with metadata for traceability.
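The strict JSON contract in step 3 can be enforced with a small validator that rejects prose drift before a decision is logged. This is a sketch under assumed key names (`label`, `reasoning`, `regulations`); the real schema may differ.

```python
import json

ALLOWED_LABELS = {"YES", "NO", "UNCERTAIN"}
REQUIRED_KEYS = {"label", "reasoning", "regulations"}

def validate_classifier_output(raw: str) -> dict:
    """Parse the LLM reply and enforce the expected schema, raising on drift."""
    data = json.loads(raw)  # raises ValueError if the model drifted into prose
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if data["label"] not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {data['label']!r}")
    if not isinstance(data["regulations"], list):
        raise ValueError("regulations must be a list of citations")
    return data
```

On validation failure the pipeline can re-prompt the model rather than silently accepting a malformed answer.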

Challenges we ran into

  • Jargon & Ambiguity: Internal feature codenames and vague PRDs often lacked explicit signals. This risked false negatives.
  • Overlaps in Laws: Multiple jurisdictions regulate the same theme (e.g., youth protections). Ensuring correct mapping was tricky.
  • Strict JSON Outputs: LLMs sometimes drifted into prose; enforcing schema validation required multiple prompt iterations.
  • RAG Quality: Without strong retrieval, models hallucinated. We learned the retrieval quality often mattered more than the LLM itself.
  • Performance & Cost: Balancing recall with cost meant careful prompt design and chunk sizing.
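The chunk-sizing trade-off in the last point can be illustrated with a simple overlapping word-window chunker. The window and overlap sizes below are illustrative defaults, not the tuned values we settled on.

```python
def chunk_text(text: str, chunk_words: int = 200, overlap_words: int = 40) -> list[str]:
    """Split legal text into overlapping word windows for indexing.

    Larger windows improve recall of cross-sentence provisions but raise
    embedding and retrieval cost; overlap keeps clauses that straddle a
    boundary retrievable from either side.
    """
    words = text.split()
    if not words:
        return []
    step = chunk_words - overlap_words
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        if start + chunk_words >= len(words):
            break
    return chunks
```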

Accomplishments that we're proud of

  • Delivered a working compliance co-pilot that turns regulatory detection into a traceable, auditable system.
  • Built a clean, demo-ready UI that non-technical users can understand in minutes.
  • Integrated RAGFlow with reranking for grounded, evidence-backed answers.
  • Created a feedback loop where UNCERTAIN outputs can be flagged for human review, ensuring extensibility.
  • Produced a project that directly aligns with TikTok’s governance priorities while showcasing modern AI agent design.
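The human-review feedback loop can be sketched as a triage step that auto-accepts definite YES/NO decisions and routes UNCERTAIN ones to a review queue. The decision-dict shape here is an assumption for illustration.

```python
def triage(decisions: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split classifier decisions into auto-accepted results and a
    human-review queue. UNCERTAIN always goes to review; YES/NO pass through.
    """
    auto, review = [], []
    for d in decisions:
        (review if d["label"] == "UNCERTAIN" else auto).append(d)
    return auto, review
```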

What we learned

  • Grounding > Guessing: Retrieval-augmented pipelines dramatically reduce hallucinations.
  • Recall is king: For compliance, missing a regulation (false negative) is more costly than over-flagging (false positive).
  • Schema enforcement: Forcing JSON outputs with validation is key to building audit-ready AI.
  • Evaluation matters: Beyond accuracy, we measured coverage and groundedness. For example, we targeted recall $R \geq 0.9$ and precision $P \geq 0.6$, balancing cost of triage with regulatory risk.
  • Human-in-the-loop: Even the best AI agent benefits from structured human review, especially for ambiguous edge cases.
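The evaluation targets above ($R \geq 0.9$, $P \geq 0.6$) reduce to standard set arithmetic over flagged features; a minimal check, assuming labeled feature IDs, looks like this:

```python
def precision_recall(predicted: set[str], actual: set[str]) -> tuple[float, float]:
    """Precision and recall for features flagged as needing geo logic.

    predicted: feature IDs the classifier flagged
    actual:    feature IDs that truly require geo-specific logic
    """
    tp = len(predicted & actual)                       # true positives
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    return precision, recall
```

Weighting recall above precision reflects the asymmetry noted above: an over-flagged feature costs a triage pass, while a missed regulation costs regulatory exposure.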

What's next for TikTok Geo-Compliance Co-pilot

  • Expand regulations: Add more jurisdictions (Brazil data localization, GDPR, India IT rules).
  • Code-level analysis: Go beyond text artifacts by scanning feature flags and geo-gates in source code.
  • Automated KB refresh: Periodically scrape authoritative sources, with human verification, to keep the KB current.
  • Fine-tuned models: Train small domain-adapted models for higher precision without sacrificing recall.
  • Deployment readiness: Wrap into a scalable, multi-tenant architecture with access control, dashboards, and audit exports.

Ultimately, this project showed how AI + RAG can transform compliance from a reactive burden into a proactive, transparent system.

Built With

  • ragflow
  • streamlit