TLDWatch - Too Long; Didn't Watch?
Elevator Pitch
TLDW? Let AI break down any video into chapters. No servers. Pure Chrome AI magic. Better learning, zero wait.
Inspiration
We've all been there—you're trying to learn something new from a 2-hour YouTube tutorial or online course, and you need to find that one specific section where the instructor explained a concept. You scrub through the timeline, skip forward, go back, and waste 10 minutes just trying to navigate.
We realized that while video is an incredible learning medium, it's fundamentally difficult to navigate compared to text.
At the same time, we saw Google's announcement of Chrome's Built-in AI APIs with Gemini Nano and immediately recognized the opportunity: what if we could automatically generate intelligent chapter breakdowns for any video, completely client-side, with zero server costs and complete privacy?
The traditional approach would require sending video transcripts to cloud APIs, costing money and raising privacy concerns. But with Chrome's built-in AI, we could process everything locally on the user's device.
The name "TLDWatch" came from the internet abbreviation "TL;DR" (Too Long; Didn't Read)—we're solving "Too Long; Didn't Watch" by making long-form video content as navigable as a well-structured article.
What It Does
TLDWatch is a Chrome extension that transforms how you consume educational video content by automatically generating AI-powered chapter breakdowns with timestamps, titles, and summaries—all processed locally on your device using Chrome's built-in Gemini Nano AI.
Core Features
✨ Automatic Chapter Generation
- Analyzes video transcripts using Chrome's Prompt API, backed by Gemini Nano
- Intelligently segments content into logical chapters with semantic understanding
- Generates descriptive titles and concise summaries for each chapter
- Processes content incrementally for fast, responsive results
⏱️ Smart Timestamps
- Every chapter includes precise, clickable timestamps
- Jump instantly to any section without scrubbing
- Real-time synchronization with video playback at any speed
🔒 Privacy-First Architecture
- All AI processing happens on-device using Chrome's Gemini Nano
- Your learning data never leaves your browser
- Zero server costs, no API quotas, no data collection
- Works completely offline once transcripts are loaded
- No tracking, no analytics, no user profiling
🌐 Multi-Platform Support
- Works seamlessly on YouTube, Coursera, Udemy, and LinkedIn Learning
- Supports any platform with HTML5 video
- Custom integration logic per platform for optimal extraction
⚡ Built with Chrome Built-in AI APIs
- Prompt API: Generates chapter titles, summaries, structured insights, and educational quiz questions
- Summarizer API: Creates concise, coherent chapter descriptions from longer explanations
- Rewriter API: Refines and improves chapter titles for clarity, consistency, and engagement
How We Built It
Technical Architecture
Four Core Components:
Content Scripts
- Inject into video platform pages to detect video players
- Extract transcripts and captions from platform-specific APIs
- Insert chapter sidebar UI without breaking existing page styles
- Handle real-time video event listeners and synchronization
Background Service Worker
- Manages Chrome AI API sessions and lifecycle
- Coordinates between content scripts and popup interface
- Handles transcript processing pipeline
- Manages caching and storage optimization
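The coordination between content scripts and the service worker could be sketched as a small message router. This is a minimal illustration, not the project's actual code: the message type names and the `createMessageRouter` helper are hypothetical. It relies on the `chrome.runtime.onMessage` convention that a listener returns `true` when it will call `sendResponse` asynchronously.

```javascript
// Hypothetical message types; the names used in TLDWatch itself may differ.
const MESSAGE_TYPES = { GENERATE: "GENERATE_CHAPTERS" };

// Builds a listener suitable for chrome.runtime.onMessage.addListener.
// `handlers` maps a message type to an async function (payload, sender) => result.
function createMessageRouter(handlers) {
  return function onMessage(message, sender, sendResponse) {
    const type = message && message.type;
    if (!Object.prototype.hasOwnProperty.call(handlers, type)) {
      return false; // Not ours: let other listeners answer.
    }
    Promise.resolve()
      .then(() => handlers[type](message.payload, sender))
      .then((result) => sendResponse({ ok: true, result }))
      .catch((error) =>
        sendResponse({ ok: false, error: String((error && error.message) || error) })
      );
    return true; // Keep the channel open for the asynchronous response.
  };
}

// Registration only happens inside an extension context.
if (typeof chrome !== "undefined" && chrome.runtime && chrome.runtime.onMessage) {
  chrome.runtime.onMessage.addListener(
    createMessageRouter({
      // Placeholder handler: a real one would run the transcript pipeline.
      [MESSAGE_TYPES.GENERATE]: async (payload) => ({ videoId: payload.videoId, chapters: [] }),
    })
  );
}
```

A content script would then call `chrome.runtime.sendMessage({ type: "GENERATE_CHAPTERS", payload: { videoId } })` and inspect the `ok` flag on the reply.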
AI Processing Pipeline
- Extract video transcript from platform-specific APIs with fallbacks
- Chunk transcript into manageable segments respecting Gemini Nano's context window
- Use Prompt API with carefully engineered prompts to identify chapter boundaries and semantic segments
- Generate descriptive titles and summaries via orchestrated API calls
- Apply Summarizer API to condense explanations while preserving meaning
- Use Rewriter API to improve title quality and consistency
- Cache results locally for instant retrieval
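The chunking step in the pipeline above can be illustrated with a short sketch. It assumes transcript segments shaped like `{ start, text }` and estimates tokens with a rough four-characters-per-token heuristic; both the shape and the heuristic are assumptions for illustration, and the real budget depends on the model's actual tokenizer.

```javascript
// Rough heuristic (an assumption): about 4 characters per token for English text.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Groups consecutive transcript segments into chunks that fit a token budget.
// A single segment larger than the budget still becomes its own chunk.
function chunkTranscript(segments, maxTokens) {
  const chunks = [];
  let current = { start: null, segments: [], tokens: 0 };
  for (const segment of segments) {
    const cost = estimateTokens(segment.text);
    if (current.segments.length > 0 && current.tokens + cost > maxTokens) {
      chunks.push(current);
      current = { start: null, segments: [], tokens: 0 };
    }
    if (current.start === null) {
      current.start = segment.start; // Remember where this chunk begins in the video.
    }
    current.segments.push(segment);
    current.tokens += cost;
  }
  if (current.segments.length > 0) {
    chunks.push(current);
  }
  return chunks;
}
```

Splitting only at segment boundaries keeps every chunk aligned with a caption timestamp, so chapter start times can be taken directly from the transcript.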
Interactive UI
- Glassmorphic sidebar with real-time video synchronization
- Smooth animations and responsive design
- Skeleton loading states for perceived performance
- Click-to-jump chapter navigation
- Responsive layout handling for different video player sizes
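Click-to-jump navigation comes down to formatting a chapter's start time for display and setting `currentTime` on the HTML5 video element. The sketch below uses hypothetical helper names and assumes chapters carry a `start` value in seconds.

```javascript
// Formats seconds as "m:ss", or "h:mm:ss" for videos over an hour.
function formatTimestamp(totalSeconds) {
  const whole = Math.max(0, Math.floor(totalSeconds));
  const hours = Math.floor(whole / 3600);
  const minutes = Math.floor((whole % 3600) / 60);
  const seconds = whole % 60;
  const pad = (n) => String(n).padStart(2, "0");
  return hours > 0
    ? `${hours}:${pad(minutes)}:${pad(seconds)}`
    : `${minutes}:${pad(seconds)}`;
}

// Seeks the video to a chapter start, clamped to the known duration.
function seekToChapter(video, chapter) {
  const limit = Number.isFinite(video.duration) ? video.duration : Infinity;
  video.currentTime = Math.min(Math.max(0, chapter.start), limit);
}
```

A chapter row's click handler would call `seekToChapter(videoElement, chapter)`, where `videoElement` is the page's `<video>` element.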
Technology Stack
Frontend: Vanilla JavaScript (lightweight, no dependencies)
APIs: Chrome Built-in AI (Gemini Nano)
- Prompt API (primary AI orchestration)
- Summarizer API (content condensation)
- Rewriter API (title refinement)
Storage: Chrome Storage API (local persistence)
Styling: CSS3 with Shadow DOM isolation
Architecture: Chrome Manifest V3
Format: Production-ready Chrome Extension
Platform Integration
- YouTube: IFrame API + auto-caption extraction + transcript fallback
- Coursera: DOM scraping with custom transcript parsing + subtitle detection
- Udemy: WebVTT subtitle parsing + mutation observers for dynamic content
- LinkedIn Learning: Caption extraction with platform-specific selectors + offset calculation
- Fallback: Generic HTML5 caption parsing for any video platform
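As an illustration of the WebVTT path, here is a minimal parser sketch that turns subtitle text into timed cues. It is a simplified reading of the format (it skips header, NOTE, and STYLE blocks and strips inline tags) and is not the extractor TLDWatch ships; the function names are hypothetical.

```javascript
// Converts "hh:mm:ss.mmm" or "mm:ss.mmm" into seconds.
function parseVttTime(value) {
  return value
    .trim()
    .split(":")
    .map(Number)
    .reduce((total, part) => total * 60 + part, 0);
}

// Parses WebVTT text into [{ start, end, text }] with times in seconds.
function parseWebVtt(vttText) {
  const cues = [];
  const blocks = vttText.replace(/\r\n?/g, "\n").split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split("\n");
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) {
      continue; // Header, NOTE, or STYLE block: no cue timing line.
    }
    const [startRaw, endRaw] = lines[timingIndex].split("-->");
    const start = parseVttTime(startRaw);
    const end = parseVttTime(endRaw.trim().split(/\s+/)[0]); // Drop cue settings.
    const text = lines
      .slice(timingIndex + 1)
      .join(" ")
      .replace(/<[^>]+>/g, "") // Strip inline tags such as <b> or <c.classname>.
      .trim();
    if (text && Number.isFinite(start) && Number.isFinite(end)) {
      cues.push({ start, end, text });
    }
  }
  return cues;
}
```

The resulting cues have the same `{ start, text }` shape the rest of the pipeline needs, which keeps platform-specific extractors interchangeable.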
Challenges We Ran Into
1. Chrome AI API Availability
Chrome's built-in AI APIs are still experimental—only available in Chrome Canary/Dev with specific feature flags enabled. We had to:
- Build comprehensive capability detection to check API availability
- Implement graceful degradation and fallback strategies
- Handle API initialization failures and timeout scenarios
- Work within Gemini Nano's context window limits (approximately 8,000 tokens)
- Manage API rate limits and request sequencing
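Capability detection and timeout handling might look like the sketch below. It assumes the APIs are exposed as global `LanguageModel`, `Summarizer`, and `Rewriter` objects with an async `availability()` method, which matches Chrome's current documentation as far as we know; earlier experimental builds used different namespaces, so treat these names as assumptions. The helper names are hypothetical.

```javascript
// Reports the availability of each built-in AI API on the given global scope.
// Assumption: each API object exposes an async availability() method.
async function detectAiSupport(scope) {
  const report = {};
  for (const name of ["LanguageModel", "Summarizer", "Rewriter"]) {
    const api = scope[name];
    if (!api || typeof api.availability !== "function") {
      report[name] = "unavailable";
      continue;
    }
    try {
      report[name] = await api.availability();
    } catch (error) {
      report[name] = "unavailable"; // Treat detection errors as missing support.
    }
  }
  return report;
}

// Rejects if the wrapped promise does not settle within `ms` milliseconds.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

In the extension, `detectAiSupport(globalThis)` would run once at startup, and session creation could be wrapped as `withTimeout(LanguageModel.create(), 30000, "LanguageModel.create")` so a stalled model download does not block the UI. The 30-second value is only an example.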
2. Platform-Specific Transcript Extraction
Every platform handles transcripts, captions, and metadata differently. We built:
- Platform detection system to identify the current video hosting service
- Custom extractors for each major platform
- Fallback extraction logic for generic HTML5 video players
- Handling for auto-generated vs. human-created captions
- Support for multiple subtitle tracks and languages
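A platform detection step can be as simple as matching the page URL against a rule table. The sketch below is illustrative: the platform ids and the exact host and path rules are assumptions, not the project's real detection logic.

```javascript
// True when `host` is `domain` itself or one of its subdomains.
function matchesDomain(host, domain) {
  return host === domain || host.endsWith("." + domain);
}

// Hypothetical rule table; ids and matching rules are illustrative.
const PLATFORM_RULES = [
  { id: "youtube", matches: (url) => matchesDomain(url.hostname, "youtube.com") },
  { id: "coursera", matches: (url) => matchesDomain(url.hostname, "coursera.org") },
  { id: "udemy", matches: (url) => matchesDomain(url.hostname, "udemy.com") },
  {
    id: "linkedin-learning",
    matches: (url) =>
      matchesDomain(url.hostname, "linkedin.com") && url.pathname.startsWith("/learning"),
  },
];

// Returns a platform id, or "generic" for any other page with HTML5 video.
function detectPlatform(href) {
  let url;
  try {
    url = new URL(href);
  } catch {
    return "generic";
  }
  const rule = PLATFORM_RULES.find((candidate) => candidate.matches(url));
  return rule ? rule.id : "generic";
}
```

Each id would map to its own extractor, with `"generic"` routed to the HTML5 caption fallback.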
3. Real-Time Video Synchronization
Keeping the sidebar perfectly synchronized with playback while handling user interactions required:
- Smart time sampling every 500ms to detect seeking events
- Smooth highlight updates without jarring jumps
- Handling of variable playback speeds (0.25x to 2x)
- Buffering and loading state management
- Prevention of listener memory leaks and duplicate handlers
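The sampling approach can be sketched as follows, assuming chapters are sorted by a `start` time in seconds. Because the lookup reads `video.currentTime` directly, it is unaffected by playback speed, and a seek simply shows up as a different index on the next sample. The function names and the returned `{ sample, stop }` shape are hypothetical.

```javascript
// Binary search for the last chapter whose start is <= currentTime.
// Returns -1 when playback is before the first chapter.
function findActiveChapterIndex(chapters, currentTime) {
  let low = 0;
  let high = chapters.length - 1;
  let found = -1;
  while (low <= high) {
    const mid = (low + high) >> 1;
    if (chapters[mid].start <= currentTime) {
      found = mid;
      low = mid + 1;
    } else {
      high = mid - 1;
    }
  }
  return found;
}

// Samples the playback position on an interval and reports chapter changes only.
function startChapterSync(video, chapters, onChange, intervalMs = 500) {
  let lastIndex = null;
  const sample = () => {
    const index = findActiveChapterIndex(chapters, video.currentTime);
    if (index !== lastIndex) {
      lastIndex = index;
      onChange(index);
    }
  };
  sample(); // Highlight the right chapter immediately.
  const timer = setInterval(sample, intervalMs);
  return { sample, stop: () => clearInterval(timer) };
}
```

Calling `stop()` when the sidebar is removed or the page navigates clears the timer, which is one way to avoid the leaked or duplicated handlers mentioned above.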
4. Content Script Injection Without Breaking Pages
Integrating our UI without breaking existing page styles and functionality required:
- Shadow DOM encapsulation to prevent CSS conflicts
- Calculated safe injection points in the DOM
- Responsive layout handling for different video player dimensions
- Event listener isolation and cleanup
- Compatibility testing across different platform architectures
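Shadow DOM encapsulation for the sidebar could be set up roughly like this. The host id, class name, and styles are hypothetical placeholders; the sketch takes the document and parent node as arguments so the injection point stays the caller's decision.

```javascript
// Mounts the sidebar inside an open shadow root so page CSS cannot leak in.
// Calling it twice returns the existing root instead of injecting a duplicate.
function mountSidebar(doc, parent, hostId = "tldwatch-sidebar-host") {
  const existing = doc.getElementById(hostId);
  if (existing && existing.shadowRoot) {
    return existing.shadowRoot;
  }
  const host = doc.createElement("div");
  host.id = hostId;
  const shadow = host.attachShadow({ mode: "open" });

  const style = doc.createElement("style");
  // `all: initial` on :host stops inherited page styles at the boundary.
  style.textContent = ":host { all: initial; } .panel { font-family: sans-serif; }";

  const panel = doc.createElement("div");
  panel.className = "panel";

  shadow.append(style, panel);
  parent.appendChild(host);
  return shadow;
}
```

In a content script this would be called as `mountSidebar(document, document.body)` or with a platform-specific container next to the player.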
5. AI Generation Speed and Performance
Users expect instant results. We implemented:
- Incremental chapter generation (showing results as they arrive)
- Skeleton loading states for perceived performance
- Local storage caching to eliminate re-processing
- Transcript chunking to stay within context window limits
- Request batching and prioritization strategies
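Caching and incremental delivery can be combined in a few lines. The sketch below injects the storage object so it works with `chrome.storage.local` (whose `get` and `set` return promises in Manifest V3) or with a test double; the key format and function names are assumptions.

```javascript
// Returns cached chapters when present; otherwise generates and stores them.
// `storage` needs promise-based get(key) and set(items), like chrome.storage.local.
async function getOrGenerateChapters(videoId, generate, storage) {
  const key = `chapters:${videoId}`; // Hypothetical key format.
  const stored = await storage.get(key);
  if (stored && stored[key]) {
    return { chapters: stored[key], fromCache: true };
  }
  const chapters = await generate(videoId);
  await storage.set({ [key]: chapters });
  return { chapters, fromCache: false };
}

// Processes chunks one at a time and reports the growing chapter list,
// so the UI can replace skeleton rows as soon as each chunk finishes.
async function generateIncrementally(chunks, processChunk, onPartial) {
  const all = [];
  for (const chunk of chunks) {
    const chapters = await processChunk(chunk);
    all.push(...chapters);
    onPartial(all.slice()); // Pass a copy so callers cannot mutate internal state.
  }
  return all;
}
```

Processing chunks sequentially also keeps requests to the on-device model ordered, at the cost of not overlapping work.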
6. Prompt Engineering for Consistent Quality
Getting AI-generated chapters to consistently meet quality standards required:
- Iterative prompt refinement focused on semantic understanding
- Emphasis on key concepts over rote-memorization questions in generated quizzes
- Clear formatting specifications for structured outputs
- Temperature and instruction tuning
- Testing across diverse video content types
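To make "clear formatting specifications" concrete, here is a sketch of a prompt builder that asks for JSON plus a defensive parser for the reply. The prompt wording, the chapter fields, and the function names are illustrative assumptions, not the prompts TLDWatch actually uses.

```javascript
// Builds a prompt that pins down the output format explicitly.
// `chunk.segments` is assumed to be [{ start, text }] with start in seconds.
function buildChapterPrompt(chunk) {
  const lines = chunk.segments.map((s) => `[${Math.floor(s.start)}] ${s.text}`);
  return [
    "You split video transcripts into chapters.",
    "Each transcript line begins with its start time in seconds, in square brackets.",
    'Reply with JSON only: an array of objects {"start": number, "title": string, "summary": string}.',
    "Use at most 8 words per title and at most 2 sentences per summary.",
    "Transcript:",
    ...lines,
  ].join("\n");
}

// Extracts and validates the JSON array from a model reply.
// Returns [] rather than throwing when the reply is malformed.
function parseChapters(raw) {
  const first = raw.indexOf("[");
  const last = raw.lastIndexOf("]");
  if (first === -1 || last <= first) {
    return [];
  }
  let data;
  try {
    data = JSON.parse(raw.slice(first, last + 1));
  } catch {
    return [];
  }
  if (!Array.isArray(data)) {
    return [];
  }
  return data
    .filter((c) => c && Number.isFinite(c.start) && typeof c.title === "string" && c.title.trim())
    .map((c) => ({
      start: c.start,
      title: c.title.trim(),
      summary: typeof c.summary === "string" ? c.summary.trim() : "",
    }))
    .sort((a, b) => a.start - b.start);
}
```

Validating the reply matters because a small on-device model will sometimes add prose around the JSON or omit fields; dropping bad entries is usually better than failing the whole chunk.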
What We're Proud Of
✅ Real Chrome AI Integration - Not a demo or proof-of-concept. Actual integration of Prompt, Summarizer, and Rewriter APIs with production-quality orchestration.
✅ Multi-Platform Support - Works on 4+ major learning platforms with custom integration logic per platform, not a one-size-fits-all approach.
✅ Production-Quality UI/UX - Polished, professional interface with glassmorphic design, smooth animations, responsive layouts, and accessibility considerations.
✅ Privacy-First by Design - Zero data sent to servers. No tracking. No account required. User data stays on device.
✅ Intelligent AI Orchestration - Sophisticated combination of multiple AI APIs (Prompt, Summarizer, Rewriter) working in concert to understand content structure and generate meaningful breakdowns.
✅ Complete Open Source - Clean, well-documented code with comprehensive README, setup instructions, MIT license, and community-ready structure.
✅ On-Device AI Leadership - Shows that on-device AI is not only possible but can beat cloud-based alternatives on privacy, cost, and responsiveness.
✅ Built Fast - Entire project completed in under 5 hours, demonstrating efficient development and clear architectural thinking.
What We Learned
Technical Insights:
- Chrome Extension Manifest V3 best practices and gotchas
- Prompt engineering techniques for consistent, reliable outputs
- Platform API integration patterns and workarounds
- Real-time synchronization optimization strategies
- Shadow DOM and content script isolation techniques
Design Principles:
- Glassmorphic UI reduces visual intrusion while maintaining usability
- Progressive enhancement ensures graceful degradation
- Loading states dramatically improve perceived performance
- Responsive design is non-negotiable for extensions
Product Thinking:
- Privacy is a genuine competitive advantage and user differentiator
- Multi-platform support is challenging but essential for reach
- Navigation problems are underrated pain points in digital learning
- On-device processing builds user trust and reduces adoption friction
AI Development:
- Context windows are real technical constraints
- Specific, well-engineered prompts vastly outperform vague instructions
- API orchestration (combining multiple task-specific APIs on the same on-device model) enables better outcomes than single-API approaches
- Temperature and instruction tuning significantly impact output quality
What's Next
Short-term (Next Release):
- Chrome Web Store launch and public availability
- Expand to edX, Khan Academy, Skillshare, and other platforms
- Offline chapter library with search functionality
- Chapter export in multiple formats (PDF, Markdown, Notion, Google Docs)
Medium-term (3-6 Months):
- Multi-language support for international learners
- Smart playlists and learning paths across multiple videos
- Study mode with AI-generated flashcards and practice questions
- Enhanced accessibility features (ARIA labels, keyboard navigation, screen reader support)
- Dark mode and customizable UI themes
Long-term (6+ Months):
- Visual frame analysis using multimodal capabilities
- Learning analytics dashboard tracking progress and retention
- Browser extension ecosystem expansion with API plugins
- Community-contributed chapter database and peer review system
- Integration with learning management systems (LMS) and note-taking apps
North Star Vision
Make every educational video as navigable and learnable as a well-written textbook, while never compromising on privacy or requiring expensive cloud infrastructure.
We believe on-device AI is the future of web applications. TLDWatch shows that this future is not only possible but better for users, for privacy, for accessibility, and for developers.
APIs Used
TLDWatch leverages three core Chrome Built-in AI APIs, orchestrated together to create intelligent chapter breakdowns:
| API | Purpose | Implementation |
|---|---|---|
| Prompt API | Primary AI orchestration for chapter identification, title generation, summary creation, and insight extraction | Engineered prompts analyze transcript segments to identify semantic boundaries and generate structured outputs |
| Summarizer API | Condenses longer explanations and chapter descriptions into concise, coherent summaries | Applied to chapter explanations to create brief, scannable summaries while preserving key information |
| Rewriter API | Improves and refines chapter titles for clarity, consistency, and engagement | Enhances AI-generated titles to be more descriptive and user-friendly |
All APIs operate on-device using Gemini Nano, ensuring zero data transmission to external servers.
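The division of labor in the table could be orchestrated along these lines. The sketch assumes a draft chapter shaped like `{ start, title, explanation }` coming out of the Prompt API step, and summarizer and rewriter instances that expose `summarize(text)` and `rewrite(text)` (as objects returned by `Summarizer.create()` and `Rewriter.create()` do in current Chrome documentation). The draft shape and helper names are hypothetical.

```javascript
// Runs a text task and falls back to the original text on error or empty output.
async function tryOrFallback(task, fallback) {
  try {
    const value = await task();
    return typeof value === "string" && value.trim() ? value.trim() : fallback;
  } catch (error) {
    return fallback;
  }
}

// Refines one draft chapter: Summarizer condenses the explanation,
// Rewriter polishes the title. Either API may be missing (null).
async function enrichChapter(draft, apis) {
  const summary = apis.summarizer
    ? await tryOrFallback(() => apis.summarizer.summarize(draft.explanation), draft.explanation)
    : draft.explanation;
  const title = apis.rewriter
    ? await tryOrFallback(() => apis.rewriter.rewrite(draft.title), draft.title)
    : draft.title;
  return { start: draft.start, title, summary };
}
```

Keeping the Prompt API output as the fallback means a chapter is still usable when the Summarizer or Rewriter API is unavailable on a given Chrome build.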
Tech Stack Summary
- Extension Framework: Chrome Manifest V3
- AI Engine: Chrome Built-in Gemini Nano
  - Prompt API for AI orchestration
  - Summarizer API for content condensation
  - Rewriter API for title refinement
- Frontend: Vanilla JavaScript (zero dependencies)
- Styling: CSS3 with Shadow DOM isolation
- Storage: Chrome Storage API for local persistence
- Supported Platforms: YouTube, Coursera, Udemy, LinkedIn Learning, HTML5 video
- Development: Production-ready code, MIT licensed, fully open source
License
MIT - Open source and community-driven. Anyone can use, modify, and distribute TLDWatch under the terms of the MIT license.
Links
- GitHub: https://github.com/KillMonga130/TLDWatch
- Demo Video: [Include video demo link]
- Chrome Web Store: Coming soon
- Live Testing: Instructions in GitHub README
Team
Mueletshedzi Moses Mubvafhi - Full Stack AI Engineer
Development Time: Under 5 hours
Competition: Google Chrome Built-in AI Challenge 2025
