TLDWatch - Too Long; Didn't Watch?

Elevator Pitch

TLDW? Let AI break down any video into chapters. No servers. Pure Chrome AI magic. Better learning, zero wait.

Inspiration

We've all been there—you're trying to learn something new from a 2-hour YouTube tutorial or online course, and you need to find that one specific section where the instructor explained a concept. You scrub through the timeline, skip forward, go back, and waste 10 minutes just trying to navigate.

We realized that while video is an incredible learning medium, it's fundamentally difficult to navigate compared to text.

At the same time, we saw Google's announcement of Chrome's Built-in AI APIs with Gemini Nano and immediately recognized the opportunity: what if we could automatically generate intelligent chapter breakdowns for any video, completely client-side, with zero server costs and complete privacy?

The traditional approach would require sending video transcripts to cloud APIs, costing money and raising privacy concerns. But with Chrome's built-in AI, we could process everything locally on the user's device.

The name "TLDWatch" came from the internet abbreviation "TL;DR" (Too Long; Didn't Read)—we're solving "Too Long; Didn't Watch" by making long-form video content as navigable as a well-structured article.

What It Does

TLDWatch is a Chrome extension that transforms how you consume educational video content by automatically generating AI-powered chapter breakdowns with timestamps, titles, and summaries—all processed locally on your device using Chrome's built-in Gemini Nano AI.

Core Features

✨ Automatic Chapter Generation

Analyzes video transcripts using Gemini Nano's Prompt API
Intelligently segments content into logical chapters with semantic understanding
Generates descriptive titles and concise summaries for each chapter
Processes content incrementally for fast, responsive results

⏱️ Smart Timestamps

Every chapter includes precise, clickable timestamps
Jump instantly to any section without scrubbing
Real-time synchronization with video playback at speeds from 0.25x to 2x

🔒 Privacy-First Architecture

All AI processing happens on-device using Chrome's Gemini Nano
Your learning data never leaves your browser
Zero server costs, no API quotas, no data collection
Works completely offline once transcripts are loaded
No tracking, no analytics, no user profiling

🌐 Multi-Platform Support

Works on YouTube, Coursera, Udemy, and LinkedIn Learning
Falls back to generic caption parsing on other sites that use HTML5 video
Custom integration logic per platform for optimal extraction

⚡ Built with Chrome Built-in AI APIs

Prompt API: Generates chapter titles, summaries, structured insights, and educational quiz questions
Summarizer API: Creates concise, coherent chapter descriptions from longer explanations
Rewriter API: Refines and improves chapter titles for clarity, consistency, and engagement

How We Built It

Technical Architecture

Four Core Components:

Content Scripts
- Inject into video platform pages to detect video players
- Extract transcripts and captions from platform-specific APIs
- Insert chapter sidebar UI without breaking existing page styles
- Handle real-time video event listeners and synchronization
Background Service Worker
- Manages Chrome AI API sessions and lifecycle
- Coordinates between content scripts and popup interface
- Handles transcript processing pipeline
- Manages caching and storage optimization
AI Processing Pipeline
- Extract video transcript from platform-specific APIs with fallbacks
- Chunk transcript into manageable segments respecting Gemini Nano's context window
- Use Prompt API with carefully engineered prompts to identify chapter boundaries and semantic segments
- Generate descriptive titles and summaries via orchestrated API calls
- Apply Summarizer API to condense explanations while preserving meaning
- Use Rewriter API to improve title quality and consistency
- Cache results locally for instant retrieval
Interactive UI
- Glassmorphic sidebar with real-time video synchronization
- Smooth animations and responsive design
- Skeleton loading states for perceived performance
- Click-to-jump chapter navigation
- Responsive layout handling for different video player sizes
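The chunking step of the AI Processing Pipeline could look roughly like the sketch below. It assumes transcript cues are objects with a `start` time in seconds and a `text` field, and it estimates tokens at about four characters each, which is a rough heuristic rather than Gemini Nano's real tokenizer. The names `estimateTokens` and `chunkTranscript` and the 1,500-token default are illustrative and not taken from the TLDWatch source.

```javascript
// Rough token estimate: about 4 characters per token (a heuristic, not the real tokenizer).
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Group cues into chunks whose estimated size stays at or under maxTokens.
// Each chunk keeps the start time of its first cue so chapters can be timestamped.
// A single cue larger than maxTokens still gets a chunk of its own.
function chunkTranscript(cues, maxTokens = 1500) {
  const chunks = [];
  let current = null;
  for (const cue of cues) {
    const cost = estimateTokens(cue.text);
    if (current && current.tokens + cost > maxTokens) {
      chunks.push(current);
      current = null;
    }
    if (!current) current = { start: cue.start, tokens: 0, cues: [] };
    current.cues.push(cue);
    current.tokens += cost;
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Keeping the budget well below the model's context window leaves room for the prompt instructions and the generated reply.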

Technology Stack

Frontend: Vanilla JavaScript (lightweight, no dependencies)
APIs: Chrome Built-in AI (Gemini Nano)
  - Prompt API (primary AI orchestration)
  - Summarizer API (content condensation)
  - Rewriter API (title refinement)
Storage: Chrome Storage API (local persistence)
Styling: CSS3 with Shadow DOM isolation
Architecture: Chrome Manifest V3
Format: Production-ready Chrome Extension

Platform Integration

YouTube: IFrame API + auto-caption extraction + transcript fallback
Coursera: DOM scraping with custom transcript parsing + subtitle detection
Udemy: WebVTT subtitle parsing + mutation observers for dynamic content
LinkedIn Learning: Caption extraction with platform-specific selectors + offset calculation
Fallback: Generic HTML5 caption parsing for any video platform
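As an illustration of the WebVTT and generic caption paths, here is a minimal parser sketch. It assumes well-formed WebVTT with `HH:MM:SS.mmm` or `MM:SS.mmm` timestamps and ignores cue settings and styling. The function names are hypothetical; the actual extractors in TLDWatch may work differently.

```javascript
// Convert "HH:MM:SS.mmm" or "MM:SS.mmm" to seconds.
function vttTimeToSeconds(stamp) {
  const parts = stamp.trim().split(":").map(Number);
  return parts.reduce((total, part) => total * 60 + part, 0);
}

// Parse WebVTT text into [{ start, end, text }] with times in seconds.
function parseVtt(vttText) {
  const cues = [];
  // Cue blocks are separated by blank lines.
  const blocks = vttText.replace(/\r\n?/g, "\n").split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split("\n").filter((line) => line.trim() !== "");
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) continue; // header, NOTE, or STYLE block
    const [startRaw, endRaw] = lines[timingIndex].split("-->");
    // Join wrapped caption lines and strip inline tags such as <b> or <c.name>.
    const text = lines
      .slice(timingIndex + 1)
      .join(" ")
      .replace(/<[^>]+>/g, "")
      .trim();
    if (!text) continue;
    cues.push({
      start: vttTimeToSeconds(startRaw),
      // Cue settings such as "align:start" may follow the end time.
      end: vttTimeToSeconds(endRaw.trim().split(/\s+/)[0]),
      text,
    });
  }
  return cues;
}
```

When a page exposes captions through a `<track>` element, the browser's own `TextTrack` cues already provide `startTime`, `endTime`, and `text`, so a manual parser like this is only needed for raw subtitle files.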

Challenges We Ran Into

1. Chrome AI API Availability

Chrome's built-in AI APIs are still experimental—only available in Chrome Canary/Dev with specific feature flags enabled. We had to:

Build comprehensive capability detection to check API availability
Implement graceful degradation and fallback strategies
Handle API initialization failures and timeout scenarios
Work within Gemini Nano's context window limits (approximately 8,000 tokens)
Manage request sequencing so calls to the on-device model do not pile up
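The capability detection and timeout handling listed above might be sketched as follows. This assumes the built-in AI APIs are exposed as the globals `LanguageModel`, `Summarizer`, and `Rewriter`, each with a static `availability()` method that resolves to a status string such as "available" or "unavailable". Older experimental builds used a different surface, so treat the global names as an assumption. `detectCapabilities`, `withTimeout`, and the 3-second default are illustrative.

```javascript
const API_NAMES = ["LanguageModel", "Summarizer", "Rewriter"];

// Resolves to an object such as { LanguageModel: "available", Summarizer: "unavailable", ... }.
// `scope` defaults to globalThis; tests can pass a stub object instead.
async function detectCapabilities(scope = globalThis, timeoutMs = 3000) {
  const result = {};
  for (const name of API_NAMES) {
    const api = scope[name];
    if (!api || typeof api.availability !== "function") {
      result[name] = "unavailable";
      continue;
    }
    try {
      result[name] = await withTimeout(api.availability(), timeoutMs);
    } catch {
      result[name] = "unavailable"; // treat errors and timeouts as a missing API
    }
  }
  return result;
}

// Reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("timed out")), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

A caller could use the result to decide which features to enable, for example skipping title refinement when `Rewriter` reports "unavailable".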

2. Platform-Specific Transcript Extraction

Every platform handles transcripts, captions, and metadata differently. We built:

Platform detection system to identify the current video hosting service
Custom extractors for each major platform
Fallback extraction logic for generic HTML5 video players
Handling for auto-generated vs. human-created captions
Support for multiple subtitle tracks and languages

3. Real-Time Video Synchronization

Keeping the sidebar perfectly synchronized with playback while handling user interactions required:

Smart time sampling every 500ms to detect seeking events
Smooth highlight updates without jarring jumps
Handling of variable playback speeds (0.25x to 2x)
Buffering and loading state management
Prevention of listener memory leaks and duplicate handlers
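One way to implement the sampling and highlight logic is sketched below. It assumes chapters are sorted by a `start` field in seconds and that `video` is any object with a `currentTime` property, such as an `HTMLVideoElement`. The function names are illustrative.

```javascript
// Chapters must be sorted by ascending `start` (seconds).
// Returns the index of the chapter containing `currentTime`, or -1 before the first one.
function findActiveChapter(chapters, currentTime) {
  let lo = 0;
  let hi = chapters.length - 1;
  let active = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (chapters[mid].start <= currentTime) {
      active = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return active;
}

// Sample the video every `intervalMs` and report only when the active chapter changes.
// Returns a stop function so the caller can clean up and avoid leaked timers.
function watchChapters(video, chapters, onChange, intervalMs = 500) {
  let last = null;
  const tick = () => {
    const index = findActiveChapter(chapters, video.currentTime);
    if (index !== last) {
      last = index;
      onChange(index);
    }
  };
  tick();
  const timer = setInterval(tick, intervalMs);
  return () => clearInterval(timer);
}
```

Because the position is read from `currentTime` on every sample, seeking and playback-rate changes need no special handling: the next sample simply lands in a different chapter.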

4. Content Script Injection Without Breaking Pages

Integrating our UI without breaking existing page styles and functionality required:

Shadow DOM encapsulation to prevent CSS conflicts
Calculated safe injection points in the DOM
Responsive layout handling for different video player dimensions
Event listener isolation and cleanup
Compatibility testing across different platform architectures

5. AI Generation Speed and Performance

Users expect instant results. We implemented:

Incremental chapter generation (showing results as they arrive)
Skeleton loading states for perceived performance
Local storage caching to eliminate re-processing
Transcript chunking to stay within context window limits
Request batching and prioritization strategies
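The caching idea can be sketched as a small wrapper around a storage object. The sketch assumes `storage` behaves like `chrome.storage.local` under Manifest V3, where `get(key)` resolves to an object containing any stored value and `set(items)` resolves once written. `getOrGenerateChapters`, `createMemoryStorage`, and the `chapters:` key prefix are hypothetical names.

```javascript
// Return cached chapters for a video, or generate and store them on a cache miss.
async function getOrGenerateChapters(videoId, generate, storage) {
  const key = `chapters:${videoId}`;
  const found = await storage.get(key);
  if (found[key] !== undefined) return { chapters: found[key], cached: true };
  const chapters = await generate(videoId);
  await storage.set({ [key]: chapters });
  return { chapters, cached: false };
}

// In-memory stand-in with the same shape, handy for tests outside the browser.
function createMemoryStorage() {
  const data = new Map();
  return {
    async get(key) {
      return data.has(key) ? { [key]: data.get(key) } : {};
    },
    async set(items) {
      for (const [k, v] of Object.entries(items)) data.set(k, v);
    },
  };
}
```

In the extension, `chrome.storage.local` could be passed as `storage` directly, so a repeat visit to the same video skips the AI pipeline entirely.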

6. Prompt Engineering for Consistent Quality

Getting AI-generated chapters to consistently meet quality standards required:

Iterative prompt refinement focused on semantic understanding
Emphasis on key concepts vs. memorization questions
Clear formatting specifications for structured outputs
Temperature and instruction tuning
Testing across diverse video content types
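To make the "clear formatting specifications" point concrete, here is a sketch of a prompt builder paired with a defensive parser. The prompt wording and the `{start, title, summary}` shape are assumptions for illustration, not the prompts TLDWatch actually uses.

```javascript
// Build a prompt that asks for a strict JSON array of chapters.
function buildChapterPrompt(chunkText) {
  return [
    "You are segmenting a video transcript into chapters.",
    'Reply with ONLY a JSON array of objects shaped like {"start": <seconds>, "title": "...", "summary": "..."}.',
    "Use the timestamps in the transcript for start times.",
    "Transcript:",
    chunkText,
  ].join("\n");
}

// Pull the JSON array out of a model reply and keep only well-formed chapters.
// Returns [] when the reply contains no parseable array.
function parseChapters(reply) {
  const begin = reply.indexOf("[");
  const end = reply.lastIndexOf("]");
  if (begin === -1 || end <= begin) return [];
  let parsed;
  try {
    parsed = JSON.parse(reply.slice(begin, end + 1));
  } catch {
    return [];
  }
  if (!Array.isArray(parsed)) return [];
  return parsed
    .filter(
      (c) =>
        c &&
        Number.isFinite(c.start) &&
        typeof c.title === "string" &&
        c.title.trim() !== ""
    )
    .map((c) => ({
      start: c.start,
      title: c.title.trim(),
      summary: String(c.summary ?? "").trim(),
    }))
    .sort((a, b) => a.start - b.start);
}
```

Small on-device models sometimes wrap JSON in prose or code fences, so validating the reply instead of trusting it keeps one malformed chunk from breaking the whole chapter list.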

What We're Proud Of

✅ Real Chrome AI Integration - Not a demo or proof-of-concept. Actual integration of Prompt, Summarizer, and Rewriter APIs with production-quality orchestration.

✅ Multi-Platform Support - Works on four major learning platforms (YouTube, Coursera, Udemy, LinkedIn Learning) with custom integration logic for each, plus a generic HTML5 fallback, not a one-size-fits-all approach.

✅ Production-Quality UI/UX - Polished, professional interface with glassmorphic design, smooth animations, responsive layouts, and accessibility considerations.

✅ Privacy-First by Design - Zero data sent to servers. No tracking. No account required. User data stays on device.

✅ Intelligent AI Orchestration - Sophisticated combination of multiple AI APIs (Prompt, Summarizer, Rewriter) working in concert to understand content structure and generate meaningful breakdowns.

✅ Fully Open Source - Clean, well-documented code with comprehensive README, setup instructions, MIT license, and community-ready structure.

✅ On-Device AI Leadership - Shows that on-device AI is practical today and, for this use case, compares favorably with cloud-based alternatives on privacy, cost, and responsiveness.

✅ Built Fast - Entire project completed in under 5 hours, demonstrating efficient development and clear architectural thinking.

What We Learned

Technical Insights:

Chrome Extension Manifest V3 best practices and gotchas
Prompt engineering techniques for consistent, reliable outputs
Platform API integration patterns and workarounds
Real-time synchronization optimization strategies
Shadow DOM and content script isolation techniques

Design Principles:

Glassmorphic UI reduces visual intrusion while maintaining usability
Progressive enhancement keeps the core experience working when a feature is unavailable
Loading states dramatically improve perceived performance
Responsive design is non-negotiable for extensions

Product Thinking:

Privacy is a genuine competitive advantage and user differentiator
Multi-platform support is challenging but essential for reach
Navigation problems are underrated pain points in digital learning
On-device processing builds user trust and reduces adoption friction

AI Development:

Context windows are real technical constraints
Specific, well-engineered prompts vastly outperform vague instructions
API orchestration (combining multiple task-specific APIs on the same on-device model) enables better outcomes than single-API approaches
Temperature and instruction tuning significantly impact output quality

What's Next

Short-term (Next Release):

Chrome Web Store launch and public availability
Expand to edX, Khan Academy, Skillshare, and other platforms
Offline chapter library with search functionality
Chapter export in multiple formats (PDF, Markdown, Notion, Google Docs)

Medium-term (3-6 Months):

Multi-language support for international learners
Smart playlists and learning paths across multiple videos
Study mode with AI-generated flashcards and practice questions
Enhanced accessibility features (ARIA labels, keyboard navigation, screen reader support)
Dark mode and customizable UI themes

Long-term (6+ Months):

Visual frame analysis using multimodal capabilities
Learning analytics dashboard tracking progress and retention
Browser extension ecosystem expansion with API plugins
Community-contributed chapter database and peer review system
Integration with learning management systems (LMS) and note-taking apps

North Star Vision

Make every educational video as navigable and learnable as a well-written textbook, while never compromising on privacy or requiring expensive cloud infrastructure.

We believe on-device AI is the future of web applications. TLDWatch shows that this future is not just possible but better for users, for privacy, for accessibility, and for developers.

APIs Used

TLDWatch leverages three core Chrome Built-in AI APIs, orchestrated together to create intelligent chapter breakdowns:

API	Purpose	Implementation
Prompt API	Primary AI orchestration for chapter identification, title generation, summary creation, and insight extraction	Engineered prompts analyze transcript segments to identify semantic boundaries and generate structured outputs
Summarizer API	Condenses longer explanations and chapter descriptions into concise, coherent summaries	Applied to chapter explanations to create brief, scannable summaries while preserving key information
Rewriter API	Improves and refines chapter titles for clarity, consistency, and engagement	Enhances AI-generated titles to be more descriptive and user-friendly

All APIs operate on-device using Gemini Nano, ensuring zero data transmission to external servers.
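The way the three APIs fit together can be sketched with injected functions, which also shows one possible form of graceful degradation. Here `apis.summarize` and `apis.rewrite` are hypothetical async wrappers (in the extension they might call the Summarizer and Rewriter APIs), and `draft` is assumed to be a `{start, title, explanation}` object produced by the Prompt API step. None of these names come from the TLDWatch source.

```javascript
// Refine a draft chapter using whichever helper APIs are present.
// If a helper is missing or fails, the draft text is kept as-is.
async function refineChapter(draft, apis = {}) {
  const attempt = async (fn, input) => {
    if (typeof fn !== "function") return input;
    try {
      const output = await fn(input);
      return typeof output === "string" && output.trim() ? output : input;
    } catch {
      return input; // fall back to the unrefined text
    }
  };
  const summary = await attempt(apis.summarize, draft.explanation);
  const title = await attempt(apis.rewrite, draft.title);
  return { start: draft.start, title: title.trim(), summary: summary.trim() };
}
```

Passing the helpers in as arguments keeps the orchestration testable outside the browser and lets the extension run with only the Prompt API when the other two are unavailable.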

Tech Stack Summary

Extension Framework: Chrome Manifest V3
AI Engine: Chrome Built-in Gemini Nano
- Prompt API for AI orchestration
- Summarizer API for content condensation
- Rewriter API for title refinement
Frontend: Vanilla JavaScript (zero dependencies)
Styling: CSS3 with Shadow DOM isolation
Storage: Chrome Storage API for local persistence
Supported Platforms: YouTube, Coursera, Udemy, LinkedIn Learning, HTML5 video
Development: Production-ready code, MIT licensed, fully open source

License

MIT - Open source and community-driven. Anyone can use, modify, and distribute TLDWatch under the terms of the MIT license.

Team

Mueletshedzi Moses Mubvafhi - Full Stack AI Engineer

Development Time: Under 5 hours

Competition: Google Chrome Built-in AI Challenge 2025

Updates

Mueletshedzi Moses Mubvafhi started this project — Nov 01, 2025 12:41 AM EDT
