Inspiration

Every day, millions of people face information overload: long articles, technical documents, news in foreign languages, and incomplete content that wastes precious time. As someone who reads in Persian, English, and Arabic, I often struggled to grasp the core ideas of a webpage without reading every word. That frustration inspired me to build Gemini Page Insights, an AI-powered Chrome extension that turns any webpage into a clear, concise summary in under 5 seconds.

I wanted to create a tool that democratizes access to knowledge, especially for students, non-native speakers, busy professionals, and language learners who don’t have the luxury of time or fluency.

What I Learned

Building this extension taught me how to:

  • Integrate Google’s Gemini AI models directly into a browser environment while respecting privacy.
  • Use Readability.js to extract clean, semantic content from a webpage, stripping away ads, sidebars, and clutter.
  • Design a responsive UI that handles both left-to-right (LTR) languages and right-to-left (RTL) languages such as Persian and Arabic.
  • Implement Manifest V3 best practices with service workers, secure storage, and efficient background processing.
  • Optimize prompts for 9 specialized content categories (e.g., Medical, Programming, News) to dramatically improve summary relevance.
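To illustrate the RTL point above, here is a minimal sketch of how a summary container could be given the right text direction. The names RTL_LANGUAGES, textDirection, and applyDirection are hypothetical and are not taken from the extension's source; the list of RTL languages is a small illustrative subset.

```javascript
// Hypothetical helper, not the extension's actual code.
// Primary language subtags that are normally written right-to-left.
const RTL_LANGUAGES = new Set(["ar", "fa", "he", "ur"]);

// Returns "rtl" or "ltr" for a language tag such as "fa", "ar-EG", or "en-US".
function textDirection(languageTag) {
  const primary = String(languageTag || "")
    .trim()
    .toLowerCase()
    .split(/[-_]/)[0];
  return RTL_LANGUAGES.has(primary) ? "rtl" : "ltr";
}

// Sets the standard `dir` and `lang` properties on an element-like object,
// for example the modal that displays the summary.
function applyDirection(element, languageTag) {
  element.dir = textDirection(languageTag);
  element.lang = languageTag;
  return element;
}
```

In a content script this might be called as applyDirection(modalElement, detectedLanguage), leaving the CSS to react to the dir attribute.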

How I Built It

Gemini Page Insights is built as a Chrome extension using Manifest V3. Here’s the architecture:

  1. Content Extraction: When the user clicks the floating ✨ button, the extension uses Readability.js to isolate the main article content.
  2. Language Detection: Automatically detects the page language (supports 13+ languages).
  3. AI Processing: Sends the cleaned text to Google’s Gemini API (user-provided key) with a category-optimized prompt (e.g., “Summarize this programming article in 3 key points…”).
  4. Smart Features:
    • Detects incomplete sentences and offers “Continue” or “Complete” actions.
    • Renders summaries with smooth animations, dark/light mode, and scroll indicators.
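  5. Privacy-First: The extension itself does not store or track page content. The extracted text is sent only to the Gemini API, using the user's own key. API keys are saved only in the user's own Chrome sync storage.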
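The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the extension's actual code: the function names, the prompt texts, and the incomplete-sentence heuristic are all hypothetical. The request shape follows the Gemini REST generateContent endpoint, and the model name is left as a placeholder.

```javascript
// All names and prompt texts below are hypothetical examples.
const CATEGORY_PROMPTS = new Map([
  ["programming", "Summarize this programming article in 3 key points."],
  ["medical", "Summarize this medical article in 3 key points, using only claims found in the text."],
  ["news", "Summarize this news article in 3 key points."],
]);
const DEFAULT_PROMPT = "Summarize this article in 3 key points.";

// Step 1: Readability's parse() returns an article object, or null when no
// article is found. In a page, pass document.cloneNode(true) because
// Readability modifies the document it is given.
function extractText(doc, ReadabilityClass) {
  const article = new ReadabilityClass(doc).parse();
  return article ? article.textContent.trim() : "";
}

// Step 3a: combine the category instruction, target language, and article text.
function buildPrompt(category, language, articleText) {
  const instruction = CATEGORY_PROMPTS.get(category) || DEFAULT_PROMPT;
  return `${instruction}\nRespond in this language: ${language}.\n\n${articleText}`;
}

// Step 3b: build the URL and fetch() options for a generateContent call.
function buildGeminiRequest(model, apiKey, prompt) {
  return {
    url: `https://generativelanguage.googleapis.com/v1beta/models/${encodeURIComponent(model)}:generateContent`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
      body: JSON.stringify({ contents: [{ parts: [{ text: prompt }] }] }),
    },
  };
}

// Step 4: assumed heuristic, treating a summary as incomplete when it does not
// end in sentence-final punctuation (optionally followed by a closing quote
// or bracket).
function looksIncomplete(summary) {
  const text = summary.trim();
  if (text === "") return false;
  return !/[.!?…؟。]["'”’)\]»]*$/u.test(text);
}
```

A caller would pass the result to fetch(request.url, request.options) and offer the "Continue" action when looksIncomplete returns true for the returned text.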

The UI is built with vanilla JavaScript and CSS (gradients and animations), with a fully responsive modal that works on desktop and mobile.

Challenges I Faced

  • RTL Layout Support: Ensuring perfect rendering for Persian and Arabic required custom CSS direction handling and font fallbacks.
  • API Rate Limits: Offering multiple Gemini models (Flash, Pro, Lite) so users can balance speed, cost, and rate limits.
  • Prompt Engineering: Crafting category-specific prompts that yield accurate, non-hallucinated summaries across domains like medicine, crypto, and engineering.
  • Manifest V3 Restrictions: Migrating from background scripts to service workers while maintaining real-time responsiveness.
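One way to cope with the rate-limit challenge above is to fall back to another model when a request is rejected. The sketch below is an assumption about how that could be done, not a description of the extension's behavior: summarizeWithFallback and callModel are hypothetical names, and callModel is assumed to throw an error carrying an HTTP `status` property.

```javascript
// Hypothetical sketch: tries each model in order and moves on to the next
// one only when the caller reports a rate-limit error (HTTP 429).
async function summarizeWithFallback(models, callModel) {
  let lastError = new Error("No models were provided");
  for (const model of models) {
    try {
      const summary = await callModel(model);
      return { model, summary };
    } catch (error) {
      // Any other failure (bad key, network error) is surfaced immediately.
      if (error.status !== 429) throw error;
      lastError = error;
    }
  }
  // Every model was rate limited; report the last error seen.
  throw lastError;
}
```

Passing callModel in as a parameter keeps the fallback logic independent of how the request is actually sent, which also makes it easy to exercise with a fake caller.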

Despite these challenges, the result is a fast, private, and powerful tool that turns the chaotic web into digestible knowledge.

Built With

  • chrome-extension-manifest-v3
  • chrome-storage-api
  • css
  • google-gemini-api
  • html
  • javascript
  • readability.js
  • rtl/ltr
  • service-workers
  • ui