Inspiration
We've all been there. You're researching a new product, planning a trip, or writing a paper. Your browser groans under the weight of 30 open tabs. The information you need is right there, but it's scattered, overwhelming, and impossible to synthesize. This "tab chaos" is a universal productivity killer, turning our most powerful research tool—the browser—into a source of stress. Our inspiration was to build a tool that transforms this chaos into clarity, using the power of AI to read, clean, and understand all your open tabs, giving you a single, intelligent answer.
What it does
TabSage is an intelligent Chrome extension that acts as your personal AI research assistant. It uses the Google Gemini API to clean and analyze the content of the tabs you select.
Prompt or Click: You can type a custom prompt (e.g., "Compare the specs and prices of these laptops") or simply click a quick-action button like "Summarize" or "Analyze."
Intelligent Processing: TabSage runs a multi-phase pipeline that is fully visible to the user:
Fetches Content: It uses the Chrome Scripting API to extract the full text from all selected tabs.
Cleans Data: It sends the raw text to the Gemini API to remove all ads, navigation menus, footers, and other "clutter," returning only the core, useful content.
Caches Data: It stores this clean, structured JSON data in chrome.storage.local for speed and to avoid re-processing.
Detects Intent: It uses the Gemini API again to analyze your prompt and the tab titles, intelligently determining if your true goal is to compare, summarize, or analyze.
Delivers Insight: The final step, which is currently a mock, will generate a clear, actionable report (such as a comparison table or a bulleted summary) tailored to your detected intent.
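As a rough illustration of the "Fetches Content" step, here is a minimal sketch. It assumes the text is read with document.body.innerText through chrome.scripting.executeScript; the function name fetchTabTexts and the shape of the returned objects are hypothetical, not the extension's actual code. The scripting object is passed in as a parameter so the function can also be run against a stub outside the browser.

```javascript
// Sketch of the "Fetches Content" step. `scripting` is expected to behave like
// chrome.scripting; it is a parameter so the function can be tried with a stub.
async function fetchTabTexts(tabs, scripting) {
  const results = [];
  for (const tab of tabs) {
    try {
      const [injection] = await scripting.executeScript({
        target: { tabId: tab.id },
        // Executed inside the page. Reading innerText is an assumption about
        // how the "full text" is obtained.
        func: () => document.body.innerText,
      });
      const text =
        injection && typeof injection.result === "string" ? injection.result : "";
      results.push({ id: tab.id, title: tab.title, text, error: null });
    } catch (err) {
      // Some pages (for example chrome:// URLs) reject script injection.
      // Record the failure and keep going with the remaining tabs.
      results.push({ id: tab.id, title: tab.title, text: "", error: String(err) });
    }
  }
  return results;
}
```

Inside the extension this would be called as fetchTabTexts(selectedTabs, chrome.scripting), where selectedTabs is whatever list of tab objects the UI has collected.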
How we built it
TabSage is built with a modern, modular tech stack, with Google technologies at its core:
Platform: Google Chrome Extension (Manifest V3), using chrome.tabs, chrome.scripting, and chrome.storage APIs.
Core AI: Google Gemini API (specifically gemini-1.5-flash-latest) accessed via the official @google/generative-ai SDK.
Frontend: React with Material-UI (MUI) for a responsive, theme-aware (light/dark mode) interface. The entire UI is a single React application.
Key Architecture: We designed a modular, multi-phase processing pipeline in React. App.js acts as a state machine, sequentially rendering different components (TabFetcher, CacheStorer, IntentDetector) that each handle one asynchronous step of the workflow.
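The state-machine idea can be sketched as a pure transition function. This is an illustration under assumptions, not the actual App.js: the names nextPhaseState and currentPhase and the { ok, data, error } outcome shape are hypothetical; only the three component names come from the description above.

```javascript
// Phase names taken from the component list above.
const PHASES = ["TabFetcher", "CacheStorer", "IntentDetector"];

// Pure transition function: given the current state and the outcome reported
// by the active phase, return the next state.
function nextPhaseState(state, outcome) {
  if (!outcome.ok) {
    // Failure: stay on the same phase and surface the error.
    return { ...state, error: outcome.error };
  }
  return {
    // Success: merge the phase's output and move on. An index equal to
    // PHASES.length means every phase has finished.
    phaseIndex: Math.min(state.phaseIndex + 1, PHASES.length),
    data: { ...state.data, ...outcome.data },
    error: null,
  };
}

// Name of the component that should be rendered, or null when all are done.
function currentPhase(state) {
  return state.phaseIndex < PHASES.length ? PHASES[state.phaseIndex] : null;
}
```

In a React component, a handler could apply this with a functional state update such as setState((prev) => nextPhaseState(prev, outcome)), so that the phase index only advances when the active step reports success.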
Challenges we ran into
The "Transparent Window" Bug: Our biggest UI challenge. We wanted a sleek, rounded-corner UI, which required making the extension's body background transparent. However, Material-UI's CssBaseline component kept repainting the body with a solid white or black background, re-introducing sharp edges. We solved this by overriding the MUI theme's MuiCssBaseline styles to explicitly force the body's backgroundColor to be transparent !important.
Reliable Sequential State: Building the multi-phase processor was complex. We had to ensure TabFetcher finished and passed its data before CacheStorer ran, and so on. We solved this by creating a robust handler (handleNextPhase) in App.js that acts as a single control point, passing data and incrementing the phase index only upon the successful completion of the previous step.
AI-powered Classification: Getting Gemini to only return one of the three valid words (compare, summarize, analyze) for the intent detection was a fun prompt engineering challenge. We had to iterate on the prompt to be extremely specific, and we built in client-side validation as a fallback, just in case.
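The client-side validation mentioned in the last item might look like the sketch below. The function name normalizeIntent and the choice of "summarize" as the default fallback are assumptions made for illustration; the write-up only says that a fallback exists.

```javascript
// The three intents the classifier is allowed to return.
const VALID_INTENTS = ["compare", "summarize", "analyze"];

// Map a raw model reply onto exactly one valid intent, or fall back.
// The default fallback value is an assumption, not taken from the project.
function normalizeIntent(rawReply, fallback = "summarize") {
  const cleaned = String(rawReply ?? "").toLowerCase();
  // Collect every valid intent that appears as a whole word in the reply.
  const found = VALID_INTENTS.filter((intent) =>
    new RegExp(`\\b${intent}\\b`).test(cleaned)
  );
  // Accept the reply only when it names exactly one intent; anything
  // ambiguous or unrecognized goes to the fallback.
  return found.length === 1 ? found[0] : fallback;
}
```

For example, a reply of "Compare." with trailing punctuation is accepted as compare, while a reply that mentions two intents, or none, is replaced by the fallback.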
Accomplishments that we're proud of
The Multi-Step AI Pipeline: We're incredibly proud that we didn't just use Gemini for a single task. We built an intelligent workflow where one AI call (cleaning) prepares the data for a second AI call (intent detection). This is a sophisticated use of the AI that results in a much smarter, more accurate product.
The "Clean Text" Feature: Using Gemini to remove web page clutter makes a real difference. It makes the final analysis more accurate and relevant, as the AI isn't distracted by ads or navigation links. This step alone provides a lot of value to the user.
The Polished UI/UX: The final app, with its beautiful gradient border, smooth dark mode, animated icon, and interactive phased-processing view, feels like a professional, "shippable" product, not just a hackathon prototype. The user is never left guessing what the extension is doing.
What we learned
Gemini as a "Utility" AI: We learned that Gemini isn't just for generating prose. It's an incredible tool for classifying (intent detection) and extracting (text cleaning) data. Using a fast model like gemini-1.5-flash for these specialized, utility tasks is incredibly powerful and efficient.
Chrome Extension Architecture: We gained a deep understanding of the Manifest V3 architecture, especially how to bridge the gap between a modern React frontend and the asynchronous, promise-based Chrome APIs (chrome.scripting, chrome.storage).
Debugging Modern Frontend Stacks: The CssBaseline bug taught us a valuable lesson in how global styles and component-scoped styles (CSS-in-JS) interact in a complex React/MUI application, and how to find and override the exact style that is causing the problem.
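For reference, the kind of override described for the CssBaseline bug can be sketched as a theme options object. This assumes the MUI v5 style of theme configuration (a components key with styleOverrides); the helper name buildThemeOptions is hypothetical, and the object is kept dependency-free here so it can be inspected on its own.

```javascript
// Builds an options object intended to be passed to MUI's createTheme().
// Assumes MUI v5-style `components` overrides; adjust for other versions.
function buildThemeOptions(mode) {
  return {
    palette: { mode }, // "light" or "dark"
    components: {
      MuiCssBaseline: {
        styleOverrides: {
          // CssBaseline paints the body with the theme's background color;
          // this forces it back to transparent so rounded corners survive.
          body: { backgroundColor: "transparent !important" },
        },
      },
    },
  };
}
```

In the app this would be used as createTheme(buildThemeOptions("dark")) and supplied to the ThemeProvider that wraps CssBaseline.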
What's next for TabSage
Implement the Final AI Step: The last step (FinalAnalysisComponent) is currently a mock. The next and most exciting step is to build the final prompts that take the cleaned data (fetchedTabData) and the detected intent (activeAction) to generate the actual comparison table, summary, or analysis for the user.
Persistent Cache Check: Before fetching, check chrome.storage.local for already-cleaned data. If the data for a tab is already cached and clean, we can skip the fetch and clean steps for that tab entirely, making the process near-instantaneous on subsequent runs.
Chat & Follow-up: After the report is generated, allow the user to ask follow-up questions (e.g., "Of these, which one has the best battery life?"), creating a full conversational experience.
Service Worker Integration: Move the heavy AI API calls to a Chrome service worker, allowing the analysis to continue in the background even if the user closes the popup.
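The persistent cache check from the list above could start from something like the following sketch. The key scheme (tabsage:clean: plus the tab URL), the cleanText field, and the function name splitByCache are all hypothetical; storageArea is expected to behave like chrome.storage.local and is passed in so the logic can be tried with a stub.

```javascript
// Hypothetical key scheme; the real extension may key its cache differently.
const cacheKey = (url) => `tabsage:clean:${url}`;

// Split tabs into those with usable cached data and those that still need
// the fetch and clean steps.
async function splitByCache(tabs, storageArea) {
  const keys = tabs.map((tab) => cacheKey(tab.url));
  // Given an array of keys, get() resolves to an object that contains only
  // the keys that were found.
  const stored = await storageArea.get(keys);
  const cached = [];
  const missing = [];
  for (const tab of tabs) {
    const entry = stored[cacheKey(tab.url)];
    // Treat an entry as usable only if it carries non-empty cleaned text.
    if (entry && typeof entry.cleanText === "string" && entry.cleanText.length > 0) {
      cached.push({ ...tab, cleanText: entry.cleanText });
    } else {
      missing.push(tab);
    }
  }
  return { cached, missing };
}
```

Only the tabs in missing would then go through fetching and cleaning. Because page content can change between runs, storing a timestamp next to cleanText and expiring old entries would probably be worth adding.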