Inspiration
Compliance review in media is stuck at 1x playback speed at best -- a human watches the content in real time -- and then there's documentation toil on top. At broadcast scale (hundreds of hours daily), this doesn't work. Violations get missed, fines get paid, and brands pull ad spend from platforms they can't trust.
Steve lives with this daily at PlutoTV, and the pain is real: FCC fines start at $500,000 per violation. Brand safety failures cost advertisers $2.8 billion annually. And every new regulation -- the EU Digital Services Act, UK Online Safety Act, upcoming US platform accountability legislation -- adds another framework to check against.
One more thing that made this the most interesting challenge of the three -- it's seemingly straightforward, but involves lots of moving parts. A perfect test case for the Claude Code "Hackathoner" plugin I've been wanting to build and try.
The name is a nod to RoboCop's ED-209: "You have 20 seconds to comply." Except our system isn't going to run amok and kill all the humans.
What it does
Upload a video. Select one or more compliance frameworks. Get a timestamped violation report with severity classifications, frame-level evidence, and direct links to the relevant policy sections. Then quickly scan through and confirm or dismiss each finding.
Six frameworks, one scan. GARM Brand Safety, FCC Broadcast Standards, YouTube Content Policy, MPAA/TV Parental Guidelines, TikTok Community Guidelines, and OFCOM/BCAP Broadcast Codes. Scan against one or all simultaneously. A single upload produces a unified compliance report — no separate review passes per framework.
Context-aware exemptions. A documentary about war contains violence. A news broadcast discusses drugs. A medical education video shows nudity. Raw detection flags all of these. 20 Seconds to Comply analyzes editorial context and automatically downgrades findings when content qualifies for journalistic, educational, artistic, or documentary exemptions. This dramatically reduces false positives.
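The downgrade logic can be sketched roughly as follows. This is an illustrative model, not the project's actual code: the rule names, context categories, and exemption table are assumptions, and the real system derives editorial context from TwelveLabs analysis rather than taking it as input.

```typescript
// Hypothetical sketch of context-aware downgrading. A finding keeps its raw
// severity unless the clip's editorial context qualifies for an exemption.
type Severity = "high" | "medium" | "low" | "exempt";
type Context = "news" | "documentary" | "educational" | "artistic" | "entertainment";

interface Finding {
  rule: string; // e.g. "violence", "nudity"
  severity: Severity;
  context: Context;
}

// Illustrative exemption table: which contexts soften which rules.
const EXEMPTIONS: Record<string, Context[]> = {
  violence: ["news", "documentary"],
  drugs: ["news", "educational"],
  nudity: ["educational", "artistic"],
};

function applyExemptions(f: Finding): Finding {
  const exemptContexts = EXEMPTIONS[f.rule] ?? [];
  if (exemptContexts.includes(f.context)) {
    // Downgrade one step rather than discarding, so reviewers still see it.
    const downgraded: Severity =
      f.severity === "high" ? "medium" :
      f.severity === "medium" ? "low" : "exempt";
    return { ...f, severity: downgraded };
  }
  return f;
}
```

Downgrading instead of deleting matters here: the finding stays visible on the timeline, so a reviewer can still override the exemption.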
Human-in-the-loop review. Every finding can be confirmed, dismissed, or escalated. Decisions show on the video timeline (checkmarks, greyed-out, exclamation marks) and persist across re-scans. When content is re-indexed or scanned against updated frameworks, previous reviewer judgments carry forward — building institutional knowledge over time.
Policy reference links. Every violation cites the specific regulatory clause it violates — FCC consumer guides, YouTube Help Center articles, GARM framework sections. Reviewers never have to look up whether something is actually a violation.
Custom rules. Create compliance templates from framework presets, add brand-specific rules (competitor logos, restricted product categories, tone guidelines), and toggle individual rules on or off.
How we built it
https://github.com/fshot/20seconds
TwelveLabs Marengo + Pegasus is the core of the detection pipeline. Marengo handles visual content understanding — violence, nudity, substance use, weapons, dangerous activities. Pegasus analyzes audio context for hate speech, misleading claims, and restricted content. Every compliance rule maps to a Marengo search query; results come back with timestamps and confidence scores.
AWS Lambda handles all heavy lifting asynchronously. API routes on Vercel are thin dispatchers — they validate input, write initial state to DynamoDB, and invoke Lambdas via fire-and-forget (InvocationType: "Event"). Three Lambdas: thumbnail generation (S3-triggered), TwelveLabs indexing, and compliance scanning. No timeouts, no blocking.
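The thin-dispatcher pattern looks roughly like this. The request schema and function name are illustrative, not the project's actual API; in production the returned params object would be passed to `new LambdaClient().send(new InvokeCommand(params))` from `@aws-sdk/client-lambda`, where `InvocationType: "Event"` makes the invocation asynchronous so the route returns immediately.

```typescript
// Sketch of a fire-and-forget Lambda dispatch from an API route
// (field and function names are assumptions for illustration).
interface ScanRequest {
  videoId: string;
  frameworks: string[];
}

function buildScanInvocation(req: ScanRequest, functionName = "compliance-scan") {
  return {
    FunctionName: functionName,
    InvocationType: "Event" as const, // async invoke: no response payload, no blocking
    Payload: JSON.stringify(req),
  };
}
```

With `"Event"`, Lambda queues the invocation and the HTTP response comes back before the scan starts, which is what keeps the Vercel routes free of timeout pressure.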
AWS Transcribe runs in parallel as a secondary audio signal. It produces word-level timestamped transcripts stored in S3, checked for profanity during scan — catching spoken content that semantic search might miss.
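The profanity pass over the word-level transcript can be sketched like this. The transcript shape is simplified (Transcribe's real output nests items under `results.items` with string timestamps), and the word list here is a placeholder.

```typescript
// Illustrative scan of a word-level transcript against a restricted-word
// list, emitting timestamped hits (shapes simplified from Transcribe JSON).
interface TranscriptWord {
  content: string;
  startTime: number; // seconds
}

const RESTRICTED = new Set(["damn", "hell"]); // placeholder word list

function findProfanity(words: TranscriptWord[]) {
  return words
    .filter((w) => RESTRICTED.has(w.content.toLowerCase()))
    .map((w) => ({ word: w.content, timestamp: w.startTime }));
}
```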
Next.js 15 on Vercel with Tailwind CSS and shadcn/ui. The review interface features a seekable video timeline with severity-colored violation bars, keyboard navigation between findings, and inline review controls.
Terraform for all AWS infrastructure — S3, DynamoDB, Lambda (with ECR-based Docker images), IAM roles, CORS, and event triggers. 513 lines of IaC.
Claude Code with the hackathoner plugin managed the entire development lifecycle. The /hack command automated the loop: read the tracking issue, find the highest-priority unblocked GitHub Issue, check for a design plan (write one if missing), create a git worktree, implement, and open a PR. 190 commits in 23 hours. 31 design documents before code was written. 11 custom skills that the hackathoner plugin researched and wrote into the repo provided instant domain expertise on broadcast compliance standards, content rating systems, brand safety frameworks, and sponsor tool APIs.
Challenges we ran into
TwelveLabs had a partial outage Saturday night. Index creation calls were failing intermittently during our core build window. We built a mock mode early (fortunate timing) and toggled between real and mock APIs as stability fluctuated. The mock mode became permanently valuable for local development.
The UX pivot that should have killed us. We started with a 4-step wizard flow. Midway through Saturday, we realized it was wrong — too many clicks to get to the thing that matters (the violation report). We threw it away and rebuilt as a single-page video library inspired by Notion. 15 commits, 2 hours of rebuild.
False positives almost embarrassed us. Sunday morning, 3.5 hours before the deadline, war footage in news clips and classical art nudity (Michelangelo's David, Louvre statues) were triggering violation flags. We built context-aware exemptions — educational, artistic, journalistic, sports — in under an hour. It went from "we have a demo problem" to merged PR before 10 AM.
Brute-force search costs. One Marengo search per rule per framework means dozens of API calls per clip. With six frameworks enabled, a single video scan fans out to 40+ searches. We identified the optimization path (deduplicate searches for rules that target the same underlying concern across frameworks) but shipped the brute-force version to meet deadline.
Accomplishments that we're proud of
190 commits in 23 hours. 8.3 commits per hour, sustained across an overnight build. 18 PRs merged. 41 issues filed and tracked. 31 design documents. Two people, one repo, no sleep.
Six compliance frameworks with real regulatory citations. Not placeholder rules — actual FCC section references, GARM category IDs, OFCOM Broadcasting Code clauses. A compliance officer could audit our rule definitions against the source documents.
Context exemptions that actually work. The system doesn't just detect — it reasons about context. A violence detection in a news broadcast gets downgraded. Nudity in classical art gets exempted. This is the difference between a toy demo and something a real compliance team would trust.
Review decisions that persist. The human-in-the-loop workflow isn't a checkbox — decisions survive re-scans, carry forward across framework updates, and cascade across frameworks when the same violation appears in multiple rulesets.
Production infrastructure, not localhost. Deployed on Vercel with real AWS infra (S3, Lambda, DynamoDB, Transcribe), Terraform-managed, presigned URLs for secure video transfer. This is a system you could hand to a DevOps team, not a demo that only runs on a laptop.
What we learned
Plan-driven development is faster, even in a hackathon. It sounds bureaucratic, but 2-3 minutes writing a design doc prevents 30-minute dead ends. When we skipped planning (early Saturday), we got spaghetti. When we planned first, implementation was focused.
TwelveLabs Marengo is remarkably capable for compliance use cases. Natural language search queries ("person holding a gun," "alcoholic beverage in a scene with minors") return timestamped results with usable confidence scores. The gap between what we searched for and what it found was smaller than expected.
The false positive problem is the real product. Detection is table stakes. The value is in reducing false positives to a level where human reviewers trust the tool enough to use it daily. Context exemptions, confidence thresholds, and severity calibration are where the product lives.
AI-assisted development at hackathon scale is a force multiplier. Claude Code didn't replace the developer — it replaced the 3-4 other engineers a project like this would normally need. The skill system and plan-driven workflow turned domain research from a day of reading regulatory documents into 30 minutes of structured output.
What's next for 20 Seconds to Comply
Search deduplication. Many rules across different frameworks target the same underlying concern — GARM "Arms & Ammunition," FCC "Dangerous Activities," and YouTube "Firearms" all need the same visual search. Mapping rules to canonical searches and fanning results back to applicable frameworks could reduce API calls by 40-60%.
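The planned deduplication reduces to a grouping step: map each rule to its canonical search, run each unique search once, then fan the results back to every rule that shares it. A minimal sketch, with illustrative rule names:

```typescript
// Group framework rules by the canonical Marengo query they resolve to.
// One API call per map key; results are shared by every rule in the bucket.
interface Rule {
  framework: string;
  name: string;
  canonicalSearch: string; // the query this rule resolves to
}

function planSearches(rules: Rule[]): Map<string, Rule[]> {
  const plan = new Map<string, Rule[]>();
  for (const rule of rules) {
    const bucket = plan.get(rule.canonicalSearch) ?? [];
    bucket.push(rule);
    plan.set(rule.canonicalSearch, bucket);
  }
  return plan;
}
```

With the GARM/FCC/YouTube firearms example above, three rules collapse into one search, which is where the projected 40-60% call reduction comes from.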
API and MAM integration. RESTful API and MCP server to plug compliance into existing content management and media asset management platforms. Compliance review inside the tools teams already use, not a separate application.
Multi-tenancy and audit trail. Role-based access, SSO, immutable decision logs with timestamps and reviewer attribution for SOC 2 and regulatory compliance documentation.
Continuous learning from reviewer feedback. Confirmed and dismissed violations become training signal. Over time, the system learns which findings a specific organization's reviewers consistently dismiss — and adjusts confidence thresholds accordingly.
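One simple way this could work (the formula is an assumption, not a committed design): raise a rule's confidence threshold in proportion to how often an organization's reviewers dismiss its findings.

```typescript
// Hypothetical per-organization threshold tuning: rules that reviewers
// consistently dismiss get a higher confidence bar before surfacing.
interface Review {
  rule: string;
  decision: "confirmed" | "dismissed";
}

function adjustThreshold(base: number, reviews: Review[], rule: string): number {
  const relevant = reviews.filter((r) => r.rule === rule);
  if (relevant.length === 0) return base;
  const dismissRate =
    relevant.filter((r) => r.decision === "dismissed").length / relevant.length;
  // Shift the threshold up to +0.2 as the dismissal rate approaches 100%,
  // capped so a rule's findings are never silenced entirely.
  return Math.min(0.95, base + 0.2 * dismissRate);
}
```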
Built With
- aws-bedrock
- claude
- lambda
- next.js
- s3
- twelvelabs
- vercel
