Inspiration
Every day, millions of people face information overload — long articles, technical documents, news in foreign languages, or incomplete content that wastes precious time. As someone who reads content in Persian, English, and Arabic, I often struggled to quickly grasp the core ideas of a webpage without reading every word. This frustration inspired me to build Gemini Page Insights: an AI-powered Chrome extension that instantly transforms any webpage into a clear, concise, and intelligent summary — in under 5 seconds.
I wanted to create a tool that democratizes access to knowledge, especially for students, non-native speakers, busy professionals, and language learners who don’t have the luxury of time or fluency.
What I Learned
Building this extension taught me how to:
- Integrate Google’s Gemini AI models directly into a browser environment while respecting privacy.
- Use Readability.js to extract clean, semantic content from any webpage — stripping away ads, sidebars, and clutter.
- Design a responsive, RTL-compatible UI that works for both left-to-right (LTR) languages and right-to-left (RTL) languages such as Persian and Arabic.
- Implement Manifest V3 best practices with service workers, secure storage, and efficient background processing.
- Optimize prompts for 9 specialized content categories (e.g., Medical, Programming, News) to dramatically improve summary relevance.
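As an illustration of the category-specific prompts mentioned above, here is a minimal sketch of how one could be assembled. The category names, the instruction wording, and the `buildPrompt` helper are assumptions made for this example, not the extension's actual prompts.

```javascript
// Hypothetical category-to-instruction table. The extension's real prompts
// and its full list of 9 categories are not reproduced here.
const CATEGORY_INSTRUCTIONS = new Map([
  ["programming", "Summarize this programming article in 3 key points."],
  ["medical", "Summarize this medical text, keeping doses and caveats exact."],
  ["news", "Summarize this news story: who, what, when, and why it matters."],
]);

// Used when the category is unknown or missing.
const DEFAULT_INSTRUCTION = "Summarize this page in 3 key points.";

// Build one prompt string from a category, an output language, and page text.
function buildPrompt(category, languageName, pageText) {
  const instruction = CATEGORY_INSTRUCTIONS.get(category) ?? DEFAULT_INSTRUCTION;
  return [
    instruction,
    `Answer in ${languageName}.`,
    "Use only facts stated in the text below.",
    "",
    pageText.trim(),
  ].join("\n");
}
```

Keeping the instructions in a lookup table means a new category only needs a new entry, and an unrecognized category falls back to a generic summary instead of failing.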
How I Built It
Gemini Page Insights is built as a Chrome extension using Manifest V3. Here’s the architecture:
- Content Extraction: When the user clicks the floating ✨ button, the extension uses Readability.js to isolate the main article content.
- Language Detection: Automatically detects the page language (supports 13+ languages).
- AI Processing: Sends the cleaned text to Google’s Gemini API (user-provided key) with a category-optimized prompt (e.g., “Summarize this programming article in 3 key points…”).
- Smart Features:
  - Detects incomplete sentences and offers “Continue” or “Complete” actions.
  - Renders summaries with smooth animations, dark/light mode, and scroll indicators.
- Privacy-First: No page content or browsing data is stored or tracked. API keys are saved only in the user’s own Chrome sync storage.
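The AI-processing and incomplete-sentence steps above can be sketched roughly as follows. This assumes the public REST form of the Gemini API `generateContent` method. The helper names (`buildGeminiRequest`, `extractSummaryText`, `looksIncomplete`) and the punctuation heuristic are hypothetical and only illustrate the idea.

```javascript
// Assumed base URL for the Gemini API REST endpoint.
const GEMINI_BASE = "https://generativelanguage.googleapis.com/v1beta/models";

// Build the URL and fetch options for one summary request.
// `model` is whichever Gemini model id the user selected.
function buildGeminiRequest(model, apiKey, prompt) {
  return {
    url: `${GEMINI_BASE}/${encodeURIComponent(model)}:generateContent`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "x-goog-api-key": apiKey, // user-provided key
      },
      body: JSON.stringify({ contents: [{ parts: [{ text: prompt }] }] }),
    },
  };
}

// Pull the generated text out of a generateContent response, or "" if absent.
function extractSummaryText(response) {
  const parts = response?.candidates?.[0]?.content?.parts ?? [];
  return parts.map((part) => part.text ?? "").join("");
}

// Heuristic (an assumption of this sketch): a summary looks cut off when it
// does not end in sentence-final punctuation, optionally followed by a
// closing quote or bracket. The Arabic question mark covers Persian and Arabic.
function looksIncomplete(summary) {
  const trimmed = summary.trim();
  if (trimmed === "") return false;
  return !/[.!?؟]["'”’)\]]*$/u.test(trimmed);
}
```

In the extension, the object returned by `buildGeminiRequest` could be passed to `fetch(url, options)`, and a `true` result from `looksIncomplete` could trigger the “Continue” or “Complete” actions.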
The UI is built with vanilla JavaScript, CSS with gradients/animations, and a fully responsive modal that works on desktop and mobile.
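Because the modal has to display Persian and Arabic summaries as well as English ones, the text direction needs to be chosen per summary. The sketch below shows one possible way to do that in vanilla JavaScript. The `directionFor` helper and the short language list are assumptions for illustration, not the extension's actual code.

```javascript
// Language codes treated as right-to-left in this sketch (not exhaustive).
const RTL_LANGUAGES = new Set(["fa", "ar", "he", "ur"]);

// Characters from the Hebrew and Arabic Unicode blocks, including the
// Arabic presentation forms. Persian is written in the Arabic script.
const RTL_CHARS = /[\u0590-\u05FF\u0600-\u06FF\u0750-\u077F\uFB50-\uFDFF\uFE70-\uFEFF]/;

// Return "rtl" or "ltr" for a summary.
// Prefer the detected language code; without one, fall back to the
// first letter found in the text.
function directionFor(languageCode, text = "") {
  const base = (languageCode ?? "").toLowerCase().split("-")[0];
  if (base !== "") return RTL_LANGUAGES.has(base) ? "rtl" : "ltr";
  const firstLetter = text.match(/\p{L}/u)?.[0] ?? "";
  return RTL_CHARS.test(firstLetter) ? "rtl" : "ltr";
}

// In the page, the result could be applied to a (hypothetical) modal element:
// modalElement.dir = directionFor(detectedLanguage, summaryText);
```

Setting the `dir` attribute on the container lets the browser handle alignment and punctuation placement, so the CSS only needs direction-aware rules rather than separate layouts.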
Challenges I Faced
- RTL Layout Support: Ensuring perfect rendering for Persian and Arabic required custom CSS direction handling and font fallbacks.
- API Rate Limits: Balancing speed and cost by offering multiple Gemini models (Flash, Pro, Lite).
- Prompt Engineering: Crafting category-specific prompts that yield accurate, non-hallucinated summaries across domains like medicine, crypto, and engineering.
- Manifest V3 Restrictions: Migrating from background scripts to service workers while maintaining real-time responsiveness.
Despite these challenges, the result is a fast, private, and powerful tool that turns the chaotic web into digestible knowledge.
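For the Manifest V3 point above, the usual pattern is for the content script to send a message and for the service worker to reply asynchronously. The sketch below is one possible shape for that handler. The `SUMMARIZE` message type, the `registerSummarizeHandler` helper, and the `summarize` callback are hypothetical names; only `chrome.runtime.onMessage.addListener` and its "return true to reply later" convention come from the Chrome extensions API.

```javascript
// Register a message listener on a `chrome.runtime`-like object.
// `summarize` is a hypothetical async function: (text) => summary string.
function registerSummarizeHandler(runtime, summarize) {
  runtime.onMessage.addListener((message, sender, sendResponse) => {
    // Ignore messages meant for other listeners.
    if (message?.type !== "SUMMARIZE") return false;

    summarize(message.text)
      .then((summary) => sendResponse({ ok: true, summary }))
      .catch((error) => sendResponse({ ok: false, error: String(error) }));

    // Returning true keeps the message channel open for the async reply.
    return true;
  });
}

// In the extension's service worker this could be wired up as:
// registerSummarizeHandler(chrome.runtime, summarizeWithGemini);
```

Passing the runtime object in as a parameter is a design choice for this sketch: it keeps the handler free of global state, which suits service workers that can be stopped and restarted at any time, and it makes the logic easy to exercise outside the browser.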
Built With
- chrome-extension-manifest-v3
- chrome-storage-api
- css
- google-gemini-api
- html
- javascript
- readability.js
- rtl/ltr
- service-workers
- ui