Inspiration
We constantly open multiple browser tabs: news articles, YouTube videos, research papers, documentation, financial charts, ChatGPT and other AI models. Keeping track of all this content, and running back and forth between those tabs and ChatGPT, quickly becomes overwhelming.
What if your browser could automatically understand all your open tabs, categorize them intelligently, and answer questions using AI that comprehends your entire browsing context?
A good chunk of the time spent in a desktop browser is also spent on YouTube. What do people do as soon as they finish watching a video? They check the comments to see how the discussion is going. Here another problem appears: most people can't get through 20-30 comments before they get bored. Summarizing that discussion is what I thought would be most impactful in all this. Plus, it is all local and private.
How We Built It
Architecture: React 18 + TypeScript frontend with a service worker backend. We leverage Chrome Extension APIs extensively (chrome.tabs, chrome.scripting, chrome.storage, and the Chrome built-in AI APIs).
AI Integration: Multi-provider system with intelligent fallbacks:
- Chrome Built-in AI (free, on-device) - First priority
- Gemini 2.0 Flash & Vision - For text and image analysis
- OpenAI GPT & Anthropic Claude - Fallbacks for complex reasoning
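The priority order above can be sketched as a simple fallback chain. This is a minimal illustration, not TabSense's actual code: the `SummaryProvider` interface and `summarizeWithFallback` function are hypothetical names, and real providers would wrap the Chrome built-in AI, Gemini, OpenAI, and Anthropic clients.

```typescript
// Hypothetical provider interface (an assumption for illustration).
interface SummaryProvider {
  name: string;
  isAvailable(): Promise<boolean>;
  summarize(text: string): Promise<string>;
}

// Try each provider in priority order and return the first success.
// Unavailable or failing providers are recorded and skipped.
async function summarizeWithFallback(
  providers: SummaryProvider[],
  text: string,
): Promise<{ provider: string; summary: string }> {
  const errors: string[] = [];
  for (const provider of providers) {
    try {
      if (!(await provider.isAvailable())) {
        errors.push(`${provider.name}: unavailable`);
        continue;
      }
      const summary = await provider.summarize(text);
      return { provider: provider.name, summary };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      errors.push(`${provider.name}: ${reason}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}
```

Passing the providers as an ordered array keeps the priority policy in one place, so putting the free on-device model first is just a matter of list order.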
Content Processing: Custom extraction scripts identify main content areas, categorize pages through multi-stage classification (URL patterns → heuristics → AI analysis), and summarize using the best available AI. Images are extracted with strict filtering (excluding ads, icons) and analyzed asynchronously using vision APIs.
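The multi-stage classification (URL patterns, then heuristics, then AI) could look roughly like the sketch below. The category names, URL rules, keyword lists, and the two-hit threshold are all assumptions made for illustration; only the staging order comes from the description above.

```typescript
type Category = "video" | "research" | "docs" | "finance" | "news" | "other";

interface PageInfo {
  url: string;
  title: string;
  text: string;
}

// Stage 1: cheap hostname rules (illustrative patterns, not the real list).
const URL_RULES: Array<[RegExp, Category]> = [
  [/(^|\.)youtube\.com$/, "video"],
  [/(^|\.)arxiv\.org$/, "research"],
  [/(^|\.)developer\.mozilla\.org$/, "docs"],
];

function hostnameOf(url: string): string | null {
  const match = /^https?:\/\/([^/:?#]+)/i.exec(url);
  return match?.[1]?.toLowerCase() ?? null;
}

function classifyByUrl(page: PageInfo): Category | null {
  const host = hostnameOf(page.url);
  if (host === null) return null;
  for (const [pattern, category] of URL_RULES) {
    if (pattern.test(host)) return category;
  }
  return null;
}

// Stage 2: keyword heuristics; needs at least two distinct hits to decide.
const KEYWORDS: Array<[Category, string[]]> = [
  ["finance", ["stock", "earnings", "dividend", "market cap"]],
  ["research", ["abstract", "methodology", "citation", "dataset"]],
  ["news", ["breaking", "reporter", "editorial", "press release"]],
];

function classifyByHeuristics(page: PageInfo): Category | null {
  const haystack = `${page.title} ${page.text}`.toLowerCase();
  let best: Category | null = null;
  let bestHits = 1; // a category must beat this, i.e. reach two hits
  for (const [category, words] of KEYWORDS) {
    const hits = words.filter((word) => haystack.includes(word)).length;
    if (hits > bestHits) {
      best = category;
      bestHits = hits;
    }
  }
  return best;
}

// Stage 3: only call the (expensive) AI classifier when the cheap stages
// cannot decide. The classifier is injected so any provider can be used.
async function classifyPage(
  page: PageInfo,
  aiClassify: (page: PageInfo) => Promise<Category>,
): Promise<Category> {
  return classifyByUrl(page) ?? classifyByHeuristics(page) ?? (await aiClassify(page));
}
```

Ordering the stages from cheapest to most expensive means most tabs never reach the AI call at all, which helps with the rate limits discussed later.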
Special Features:
- YouTube Data API integration for video metadata, transcripts, and comments with automatic translation
- Image analysis using Gemini Vision API (filters out 90%+ of irrelevant images)
- Category-wide QA that synthesizes information across all tabs in a category
- Export capabilities (CSV/Excel with sentiment formatting, Markdown reports, Google Sheets/Docs integration)
Challenges We Faced
Service Worker Limitations: Stateless workers that can terminate anytime. Solved with comprehensive state persistence and re-entrancy guards.
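A minimal sketch of the two ideas named above, state persistence and a re-entrancy guard, is shown below. The `KeyValueStore` interface, `MemoryStore`, and `TabProcessor` are hypothetical names; in a real extension the store would presumably be backed by chrome.storage.local, while the in-memory version here only exists so the example runs on its own.

```typescript
// Minimal async key-value store abstraction (an assumption for this sketch).
interface KeyValueStore {
  get(key: string): Promise<unknown>;
  set(key: string, value: unknown): Promise<void>;
}

// In-memory stand-in for persistent storage, used for demonstration only.
class MemoryStore implements KeyValueStore {
  private readonly data = new Map<string, unknown>();
  async get(key: string): Promise<unknown> {
    return this.data.get(key);
  }
  async set(key: string, value: unknown): Promise<void> {
    this.data.set(key, value);
  }
}

class TabProcessor {
  // Re-entrancy guard: at most one in-flight job per tab id.
  private readonly inFlight = new Map<number, Promise<string>>();
  private readonly store: KeyValueStore;
  private readonly work: (tabId: number) => Promise<string>;

  constructor(store: KeyValueStore, work: (tabId: number) => Promise<string>) {
    this.store = store;
    this.work = work;
  }

  process(tabId: number): Promise<string> {
    const running = this.inFlight.get(tabId);
    if (running) return running; // duplicate event: reuse the running job
    const job = (async () => {
      try {
        return await this.run(tabId);
      } finally {
        this.inFlight.delete(tabId);
      }
    })();
    this.inFlight.set(tabId, job);
    return job;
  }

  private async run(tabId: number): Promise<string> {
    const key = `tab:${tabId}`;
    // A restarted worker has empty memory, so check persisted state first.
    const saved = await this.store.get(key);
    if (typeof saved === "string") return saved;
    const result = await this.work(tabId);
    await this.store.set(key, result);
    return result;
  }
}
```

Because the in-memory `inFlight` map is lost whenever the worker terminates, the persisted result is what lets a freshly started worker avoid redoing finished work.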
Content Extraction: Every website has a different HTML structure. Built a robust extraction algorithm with multiple fallback strategies for identifying main content.
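Image Filtering: Distinguishing relevant images from ads/icons. Implemented strict filtering based on dimensions, DOM context, URL patterns, and alt text. Fetched images from the service worker, whose requests are not bound by the page's CORS restrictions when the extension holds the matching host permissions.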
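The four filtering signals (dimensions, DOM context, URL patterns, alt text) can be combined into a single predicate, roughly as follows. Every threshold and pattern here is an assumed value chosen for illustration, and `ImageCandidate` is a hypothetical shape for what a content script might collect.

```typescript
// Hypothetical record collected for each <img> on the page.
interface ImageCandidate {
  src: string;
  width: number;
  height: number;
  alt: string;
  insideMainContent: boolean; // DOM context: is it within the article body?
}

// Assumed thresholds and patterns; tune against real pages.
const MIN_SIDE_PX = 200;
const MAX_ASPECT_RATIO = 4; // very wide or very tall images are often banners
const BLOCKED_URL = /(doubleclick|\/ads?\/|favicon|sprite|tracking-pixel)/i;
const BLOCKED_ALT = /\b(advertisement|sponsored|logo|icon|avatar)\b/i;

function isRelevantImage(img: ImageCandidate): boolean {
  if (img.width < MIN_SIDE_PX || img.height < MIN_SIDE_PX) return false;
  const aspect = Math.max(img.width, img.height) / Math.min(img.width, img.height);
  if (aspect > MAX_ASPECT_RATIO) return false;
  if (!img.insideMainContent) return false;
  if (BLOCKED_URL.test(img.src)) return false;
  if (BLOCKED_ALT.test(img.alt)) return false;
  return true;
}
```

Running the cheap checks before any vision API call means only images that pass all of them cost a request.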
AI Rate Limits: Multiple providers with different limits. Prioritized Chrome AI (free), implemented smart caching, used extractive summaries as fallback, and debounced tab updates.
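When every AI provider is rate-limited, an extractive summary needs no model at all. The sketch below uses word-frequency scoring, which is one common way to do this; the write-up does not say which method TabSense uses, so treat the scoring scheme, the stop-word list, and the `extractiveSummary` name as assumptions.

```typescript
// Small illustrative stop-word list; a real one would be longer.
const STOP_WORDS = new Set([
  "the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "that",
  "this", "for", "on", "with", "as", "are", "was", "be", "by", "at",
]);

function tokenize(sentence: string): string[] {
  const words = sentence.toLowerCase().match(/[a-z0-9']+/g) ?? [];
  return words.filter((word) => !STOP_WORDS.has(word));
}

// Pick the sentences whose words are most frequent in the whole text,
// then return them in their original order.
function extractiveSummary(text: string, maxSentences = 3): string {
  const sentences = (text.match(/[^.!?]+[.!?]+|[^.!?]+$/g) ?? [])
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  if (sentences.length <= maxSentences) return sentences.join(" ");

  // Document-wide word frequencies.
  const freq = new Map<string, number>();
  for (const sentence of sentences) {
    for (const word of tokenize(sentence)) {
      freq.set(word, (freq.get(word) ?? 0) + 1);
    }
  }

  // Score each sentence by the average frequency of its words.
  const scored = sentences.map((sentence, index) => {
    const words = tokenize(sentence);
    const total = words.reduce((sum, word) => sum + (freq.get(word) ?? 0), 0);
    return { sentence, index, score: words.length > 0 ? total / words.length : 0 };
  });

  return scored
    .sort((a, b) => b.score - a.score || a.index - b.index)
    .slice(0, maxSentences)
    .sort((a, b) => a.index - b.index)
    .map((entry) => entry.sentence)
    .join(" ");
}
```

Averaging by sentence length keeps long sentences from winning just because they contain more words.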
YouTube Translation: Comments in various languages. Integrated Chrome Translator API with heuristic language detection and external fallbacks.
Category-Wide QA: Synthesizing across tabs. Built prompts that combine context from all relevant tabs, with external search used for fact-checking.
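One way to assemble context from all tabs in a category is to number each tab's summary as a source and cap the total context size. The prompt wording, the `TabSummary` shape, and the character budget below are assumptions for illustration; the external fact-checking step is left out.

```typescript
// Hypothetical per-tab record produced by the summarization step.
interface TabSummary {
  title: string;
  url: string;
  summary: string;
}

function buildCategoryPrompt(
  category: string,
  question: string,
  tabs: TabSummary[],
  maxContextChars = 6000, // assumed budget, not a real model limit
): string {
  const blocks: string[] = [];
  let used = 0;
  tabs.forEach((tab, i) => {
    const block = `[${i + 1}] ${tab.title} (${tab.url})\n${tab.summary.trim()}`;
    // Skip sources that would exceed the budget; numbering stays stable.
    if (used + block.length > maxContextChars) return;
    blocks.push(block);
    used += block.length;
  });

  return [
    `You are answering a question about the user's open "${category}" tabs.`,
    "Use only the sources below and cite them by number, for example [1].",
    "If the sources do not contain the answer, say so.",
    "",
    "Sources:",
    blocks.join("\n\n"),
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

Numbered sources make it possible to ask the model for citations, which in turn makes answers easier to check against the original tabs.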
UI Synchronization: Service worker and UI run in separate contexts. Implemented a comprehensive message-passing system with proper error handling.
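The message-passing layer can be sketched as a typed request union plus a dispatcher that never throws and always returns a reply envelope. The message types, the `Backend` interface, and the `dispatch` function are hypothetical; in an extension, `dispatch` would typically be called from a chrome.runtime.onMessage listener, which is omitted here so the example stays self-contained.

```typescript
// Hypothetical messages the UI can send to the service worker.
type UiRequest =
  | { type: "GET_TAB_SUMMARY"; tabId: number }
  | { type: "ASK_CATEGORY"; category: string; question: string };

// Every reply is an envelope, so the UI never has to catch a raw exception.
type Reply = { ok: true; data: unknown } | { ok: false; error: string };

interface Backend {
  getTabSummary(tabId: number): Promise<string>;
  askCategory(category: string, question: string): Promise<string>;
}

// Validate untrusted input before treating it as a UiRequest.
function parseRequest(raw: unknown): UiRequest | null {
  if (typeof raw !== "object" || raw === null) return null;
  const msg = raw as Record<string, unknown>;
  const type = msg["type"];
  const tabId = msg["tabId"];
  const category = msg["category"];
  const question = msg["question"];
  if (type === "GET_TAB_SUMMARY" && typeof tabId === "number") {
    return { type, tabId };
  }
  if (type === "ASK_CATEGORY" && typeof category === "string" && typeof question === "string") {
    return { type, category, question };
  }
  return null;
}

function run(req: UiRequest, backend: Backend): Promise<string> {
  if (req.type === "GET_TAB_SUMMARY") return backend.getTabSummary(req.tabId);
  return backend.askCategory(req.category, req.question);
}

async function dispatch(raw: unknown, backend: Backend): Promise<Reply> {
  const req = parseRequest(raw);
  if (req === null) return { ok: false, error: "Malformed or unknown message" };
  try {
    return { ok: true, data: await run(req, backend) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

Returning `{ ok: false, error }` instead of throwing gives the UI one code path for showing failures, whether the cause was a bad message or a backend error.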
What We Learned
Technical: Deep understanding of Chrome Extension architecture, AI integration patterns, content extraction techniques, image processing, and performance optimization through debouncing and caching. Learned when to use different AI models for different tasks.
Product: Importance of non-blocking operations, clear feedback, and graceful error handling. Effective AI prompt engineering significantly improves output quality. Category-wide queries are more powerful than single-tab queries for research.
Architecture: Managing state across service worker and UI contexts, implementing retry mechanisms and fallbacks at every layer, and designing for scalability to handle dozens of tabs without performance degradation.
Impact
TabSense transforms browser tabs into a knowledge base. Users can automatically process all tabs, ask questions across their entire browsing session, understand visual content through AI analysis, and export insights for sharing. Paired with sentiment analysis of YouTube comments, it demonstrates the power of AI to make information more accessible and actionable.
Built With
- anthropic-api
- chrome-extension-apis
- gemini-api
- googledocsapi
- googlesheetsapi
- newsapi
- openai-api
- serperapi
- tailwind-css
- typescript
- vite
- wikipedia-api
- youtube-data-api