Gemini Page Insights: AI Summarizer for Chrome

Inspiration

Every day, millions of people face information overload: long articles, technical documents, news in foreign languages, and incomplete content that wastes precious time. As someone who reads in Persian, English, and Arabic, I often struggled to grasp the core ideas of a webpage without reading every word. That frustration led me to build Gemini Page Insights, an AI-powered Chrome extension that turns any webpage into a clear, concise summary in under 5 seconds.

I wanted to create a tool that democratizes access to knowledge, especially for students, non-native speakers, busy professionals, and language learners who don’t have the luxury of time or fluency.

What I Learned

Building this extension taught me how to:

Integrate Google’s Gemini AI models directly into a browser environment while respecting privacy.
Use Readability.js to extract clean, semantic content from any webpage — stripping away ads, sidebars, and clutter.
Design a responsive, RTL-compatible UI that renders correctly for both left-to-right (LTR) languages and right-to-left (RTL) languages such as Persian and Arabic.
Implement Manifest V3 best practices with service workers, secure storage, and efficient background processing.
Optimize prompts for 9 specialized content categories (e.g., Medical, Programming, News) to dramatically improve summary relevance.
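
As a rough illustration of the Readability.js step in the list above, the extraction can be wrapped in a small helper. This is a minimal sketch, not the extension's actual code: `extractMainText` is a hypothetical name, and the `Readability` class is passed in as a parameter so the helper can be exercised outside the browser. The `new Readability(doc).parse()` call and its `title` and `textContent` fields come from Mozilla's Readability.js.

```javascript
// Hypothetical helper: `extractMainText` is not taken from the extension's source.
// In a content script, `doc` would be `document` and `ReadabilityCtor` would be
// the Readability class shipped with Mozilla's Readability.js.
function extractMainText(doc, ReadabilityCtor) {
  // Readability modifies the DOM it is given, so parse a clone of the page.
  const article = new ReadabilityCtor(doc.cloneNode(true)).parse();

  // parse() returns null when no article-like content is found.
  if (!article || !article.textContent) {
    return null;
  }

  return {
    title: article.title || "",
    // Collapse runs of whitespace so the text sent to the model stays compact.
    text: article.textContent.replace(/\s+/g, " ").trim(),
  };
}
```

In the extension this would presumably be called as `extractMainText(document, Readability)`, with a `null` result signalling that the page has no readable article.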

How I Built It

Gemini Page Insights is built as a Chrome extension using Manifest V3. Here’s the architecture:

Content Extraction: When the user clicks the floating ✨ button, the extension uses Readability.js to isolate the main article content.
Language Detection: Automatically detects the page language (supports 13+ languages).
AI Processing: Sends the cleaned text to Google’s Gemini API (user-provided key) with a category-optimized prompt (e.g., “Summarize this programming article in 3 key points…”).
Smart Features:
- Detects incomplete sentences and offers “Continue” or “Complete” actions.
- Renders summaries with smooth animations, dark/light mode, and scroll indicators.
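Privacy-First: The extension does not store or track any data. Page text goes only to the Gemini API, and the API key is saved only in the user’s Chrome sync storage, which is tied to their own Chrome profile.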
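
The AI processing step above can be sketched as three small functions: build a category prompt, build the request, and read the response. This is an illustration under stated assumptions. The template strings, the function names (`buildPrompt`, `buildGeminiRequest`, `readSummary`), and the end-of-sentence heuristic are hypothetical and not taken from the extension. The endpoint shape, the `x-goog-api-key` header, and the `candidates[0].content.parts` response layout follow the public Gemini REST API.

```javascript
// Hypothetical category templates; the extension's real prompts are not shown here.
const CATEGORY_PROMPTS = {
  programming: "Summarize this programming article in 3 key points.",
  news: "Summarize this news story and explain why it matters.",
  general: "Summarize this page in a few concise bullet points.",
};

// Combine the category instruction, the target language, and the page text.
function buildPrompt(category, pageText, language) {
  const instruction = CATEGORY_PROMPTS[category] || CATEGORY_PROMPTS.general;
  return `${instruction}\nRespond in ${language}.\n\n${pageText}`;
}

// Build the URL and fetch() options for a generateContent call.
// The key is supplied by the user and sent as a header, not in the URL.
function buildGeminiRequest(model, apiKey, prompt) {
  return {
    url: `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
      body: JSON.stringify({ contents: [{ parts: [{ text: prompt }] }] }),
    },
  };
}

// Pull the summary text out of a generateContent response and flag summaries
// that look cut off, which is where a "Continue" action could be offered.
function readSummary(responseJson) {
  const candidate = responseJson?.candidates?.[0];
  const parts = candidate?.content?.parts ?? [];
  const text = parts.map((part) => part.text || "").join("").trim();

  // Assumed heuristic: the model hit its token limit, or the text does not end
  // with sentence-final punctuation (including the Arabic/Persian question mark).
  const endsCleanly = /[.!?؟。]["'”’)\]]*$/.test(text);
  const incomplete =
    text.length > 0 && (candidate?.finishReason === "MAX_TOKENS" || !endsCleanly);

  return { text, incomplete };
}
```

A caller would pass the result of `buildGeminiRequest` to `fetch(url, options)`, parse the JSON body, and hand it to `readSummary`. The model name is whatever the user selected, for example `gemini-2.5-flash`.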

The UI is built with vanilla JavaScript and CSS (gradients and animations), and it centers on a fully responsive modal that works on desktop and mobile.
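
For the RTL support described earlier, one simple approach is to derive the modal’s text direction from the detected language code. The sketch below assumes the language detector returns a BCP 47 style tag such as `fa` or `ar-EG`. The `textDirection` name and the language set are illustrative and not taken from the extension.

```javascript
// Primary language subtags that are normally written right-to-left.
// Illustrative list, not exhaustive.
const RTL_LANGUAGES = new Set(["ar", "fa", "he", "ur", "ps", "sd", "ug", "yi", "dv"]);

// Return "rtl" or "ltr" for a language tag such as "fa", "ar-EG", or "en_US".
function textDirection(languageTag) {
  const primary = String(languageTag || "").toLowerCase().split(/[-_]/)[0];
  return RTL_LANGUAGES.has(primary) ? "rtl" : "ltr";
}

// In the extension, the result could be applied to the summary container, e.g.
//   modalElement.dir = textDirection(detectedLanguage);
// so that CSS logical properties and text alignment follow the reading direction.
```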

Challenges I Faced

RTL Layout Support: Ensuring perfect rendering for Persian and Arabic required custom CSS direction handling and font fallbacks.
API Rate Limits: Offering multiple Gemini models (Flash, Pro, Lite) so users can balance speed and cost while staying within rate limits.
Prompt Engineering: Crafting category-specific prompts that yield accurate, non-hallucinated summaries across domains like medicine, crypto, and engineering.
Manifest V3 Restrictions: Migrating from background scripts to service workers while maintaining real-time responsiveness.
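
One possible way to soften the rate-limit challenge above is to fall back to another model when a request is rejected with HTTP 429. This is an assumption for illustration and not necessarily what the extension does. `summarizeWithFallback` and the `callModel` callback are hypothetical names, and `callModel` is expected to throw an error carrying a `status` property when the API refuses the request.

```javascript
// Try each model in order and move on only when the failure is a rate limit.
// `callModel(model)` is a caller-supplied async function (hypothetical) that
// returns the summary text or throws an Error with a numeric `status`.
async function summarizeWithFallback(models, callModel) {
  let lastError = null;

  for (const model of models) {
    try {
      const summary = await callModel(model);
      return { model, summary };
    } catch (error) {
      if (error && error.status === 429) {
        // Rate limited: remember the error and try the next model.
        lastError = error;
        continue;
      }
      // Any other failure (bad key, network error) is not helped by switching models.
      throw error;
    }
  }

  throw lastError || new Error("No models configured");
}
```

The order of `models` encodes the speed and cost trade-off: putting the cheapest or fastest model first keeps typical requests inexpensive, and heavier models are used only when the first choice is unavailable.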

Despite these challenges, the result is a fast, private, and powerful tool that turns the chaotic web into digestible knowledge.