Inspiration

I spend a lot of time bouncing between tabs, trying to make sense of dense webpages, PDFs, and forms. English isn't my first language, so things like tax forms or government pages can feel like a wall of text. I kept wishing I could just ask a quick question out loud and get a helpful answer right where I was.

So I built Mentelo—a Chrome extension that brings an AI helper into the page you're already on. No copy-paste. No tab switching. And yes, you can talk to it. Mentelo is the tool I wanted while filling out forms, learning from long articles, and writing in a second language.

What it does

Mentelo has two UIs that work side by side: a floating menu and a side panel.

The floating menu offers the most commonly used features: summarize page, translate, text-to-speech, chat with page, and call Mindy. The side panel provides a fuller interface for chat with page, summarize page, call Mindy, clipboard history, and temp bookmarks, plus a settings tab where you can add your own API key, choose a preferred language, and select a voice model.

Mentelo also offers built-in writing suggestions. An AI button appears in any text field: select part of the text to improve just that passage, or use the button to rewrite the entire field.

On YouTube, Mentelo adds a summary panel styled to match the native design, with TL;DR, Takeaways, and Notes options. A right-click context menu works on any website and lets you translate highlighted text or extract text from images.

Overview of Mentelo's features:

Turn any page into a conversation.

  • Fixes your writing as you type
  • Rewrites text to be clearer, more casual, more formal, or shorter
  • Translates any page and text on a website while maintaining the original text layout and design
  • Summarizes long pages and YouTube videos so you get the point fast
  • Lets you chat with webpages and PDFs—ask any questions you have by talking
  • Extracts text from images and describes images as text
  • Reads text aloud in natural-sounding speech using the Gemini TTS model
  • And my favorite: you can talk to it like a person

"Call Mindy"—the voice assistant

  • You speak. Mindy replies in real time.
  • Mindy knows what's on the page you're viewing, so answers are grounded.
  • Prefer typing? That works too. Switch anytime.

Built for the real web

  • Works on articles, forms, PDFs, images, and videos
  • Helps organize temporary bookmarks for a browsing session
  • One-click translate for entire pages or selected text

How we built it

I started with a simple prompt, asking AI to build me a Chrome extension that uses the Google Gemini API. The extension would summarize pages, translate pages and text, chat with pages, and call Mindy (live audio with AI), and it would include other useful features such as a list of copied text and images and temp bookmarks. I then described how I wanted it to look: a floating menu and a side panel that wouldn't interfere with viewing the page.

It took me a few iterations to get what I liked.

I first built everything on the Gemini API so I could confirm the extension worked end to end. Once most features were done, I researched the available built-in APIs and added Google Built-in AI as an option alongside the Gemini API, so the original cloud path still works.

I wanted speed and privacy by default, with power when you need it. Mentelo uses a hybrid approach:

  • Built-in Chrome AI (Gemini Nano) for fast, private, no-cost tasks
  • Cloud AI for heavy lifting like vision, voice, and advanced multimodal tasks
  • A smart fallback system that picks what's best for the job and switches if something is unavailable

You can also set your own preference.
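
The fallback idea can be sketched roughly as follows. This is a minimal illustration, not Mentelo's actual code: the names `Task`, `pickProviders`, and `runWithFallback` are hypothetical, and the set of cloud-only tasks is an assumption based on the features this write-up lists as cloud-only.

```typescript
// Hypothetical sketch of a hybrid provider fallback (not Mentelo's real code).
type Provider = "built-in" | "cloud";
type Preference = Provider | "auto";
type Task = "summarize" | "translate" | "rewrite" | "chat" | "ocr" | "voice" | "tts";

interface ProviderStatus {
  builtInAvailable: boolean; // e.g. the on-device model is downloaded and ready
  cloudAvailable: boolean;   // e.g. an API key is configured
}

// Assumption: these tasks are only handled by the cloud models.
const CLOUD_ONLY_TASKS: ReadonlySet<Task> = new Set<Task>(["ocr", "voice", "tts"]);

// Returns the providers to try, in order, for a given task and user preference.
function pickProviders(task: Task, preference: Preference, status: ProviderStatus): Provider[] {
  // "auto" and "built-in" both try the on-device model first.
  const order: Provider[] =
    preference === "cloud" ? ["cloud", "built-in"] : ["built-in", "cloud"];
  return order.filter((provider) =>
    provider === "built-in"
      ? status.builtInAvailable && !CLOUD_ONLY_TASKS.has(task)
      : status.cloudAvailable
  );
}

// Tries each provider in order and moves to the next one when a call fails.
async function runWithFallback<T>(
  providers: Provider[],
  run: (provider: Provider) => Promise<T>
): Promise<T> {
  let lastError: unknown = new Error("No provider available");
  for (const provider of providers) {
    try {
      return await run(provider);
    } catch (err) {
      lastError = err; // remember the failure and try the next provider
    }
  }
  throw lastError;
}
```

With both providers available, `pickProviders("summarize", "auto", status)` puts the built-in model first and keeps the cloud as a backup, while a cloud-only task such as `"ocr"` skips the built-in model entirely.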

The AI models used in this extension depend on whether you choose Google Built-in AI or the Gemini Cloud API.

| #  | Feature                      | Provider         | API/Model                                                    |
|----|------------------------------|------------------|--------------------------------------------------------------|
| 1  | Grammar Check                | Built-in + Cloud | Proofreader API + Prompt API / Gemini 2.5 Flash Lite         |
| 2  | Translation                  | Built-in + Cloud | Translator API / Gemini 2.5 Flash Lite                       |
| 3  | Summarization                | Built-in + Cloud | Summarizer API / Gemini 2.5 Flash Lite                       |
| 4  | Text Rewriting               | Built-in + Cloud | Rewriter API + Prompt API / Gemini 2.5 Flash Lite            |
| 5  | Content Generation           | Built-in + Cloud | Prompt API (LanguageModel) / Gemini 2.5 Flash Lite           |
| 6  | Chat with Page               | Built-in + Cloud | Prompt API / Gemini                                          |
| 7  | YouTube Summary              | Built-in + Cloud | Summarizer API / Gemini                                      |
| 8  | Text Field Assistant         | Built-in + Cloud | Proofreader + Rewriter + Prompt APIs / Gemini                |
| 9  | Page Translation             | Built-in + Cloud | Translator API / Gemini                                      |
| 10 | Page Summarization           | Built-in + Cloud | Summarizer API / Gemini                                      |
| 11 | Image Text Extraction (OCR)  | Cloud Only       | Gemini 2.0 Flash (Vision)                                    |
| 12 | Image Explanation            | Cloud Only       | Gemini 2.0 Flash (Vision)                                    |
| 13 | PDF Chat                     | Cloud            | Gemini 2.0 Flash + OCR                                       |
| 14 | Call Mindy (Voice Assistant) | Cloud Only       | Gemini 2.5 Flash Native-Audio                                |
| 15 | Text-to-Speech (TTS)         | Cloud Only       | Gemini TTS                                                   |

Challenges we ran into

PDF support is tricky, especially with scanned files. Originally, I wanted to extract text from PDFs directly inside Chrome, but I realized that text can't be pulled from a PDF the way it can from a normal webpage. So I switched to OCR, which in the end works as intended.

Live grammar checking and suggestion highlights were also a challenge. The highlights don't follow the text when the page scrolls, and on some webpages they still show up outside the text field.
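
One possible way to keep highlights inside a text field is to clip each highlight box against the field's box every time the page scrolls or resizes. The sketch below shows only the geometry. It is not how Mentelo currently works, and the names `Box` and `clipToField` are hypothetical. In the browser, both boxes could come from `getBoundingClientRect()`, which returns viewport coordinates, so the calculation would need to be rerun on scroll and resize events.

```typescript
// Hypothetical sketch: clip a highlight rectangle to its text field.
// All coordinates are assumed to be in the same (viewport) coordinate space.
interface Box {
  left: number;
  top: number;
  right: number;
  bottom: number;
}

// Returns the visible part of the highlight, or null when the highlight
// lies entirely outside the field and should be hidden.
function clipToField(highlight: Box, field: Box): Box | null {
  const left = Math.max(highlight.left, field.left);
  const top = Math.max(highlight.top, field.top);
  const right = Math.min(highlight.right, field.right);
  const bottom = Math.min(highlight.bottom, field.bottom);
  if (right <= left || bottom <= top) {
    return null; // no overlap: nothing to draw
  }
  return { left, top, right, bottom };
}
```

A highlight that has scrolled partly out of the field is trimmed to the visible portion, and one that has scrolled out completely yields `null`, so it can be hidden instead of floating over the rest of the page.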

Keeping the UI light but capable is a balance: powerful when needed, invisible when not. I found that semi-transparency works quite well on websites—it feels more like native UI than a third-party plugin.

Accomplishments that we're proud of

The entire project was built in a week with the help of AI. This is my first Chrome extension, and I'm quite happy with how it turned out: it has features I can use on a daily basis. That is a huge achievement for me, and learning how to use APIs in a Chrome extension was a big step up too.

What we learned

During development, I learned how to use the Gemini API and the Built-in AI APIs to build tools that run in the browser. I also learned how to use OCR to extract text from images for further processing, how to break large texts into smaller chunks for smaller AI models, and how to use Google Built-in AI to run models on-device, which works offline and improves privacy at no extra cost.
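
The chunking step can be illustrated with a small sketch. It assumes a simple character limit and splits on sentence-ending punctuation. The function name `chunkText` and the limit are hypothetical, and Mentelo's real chunking may differ (for example, it might count tokens rather than characters).

```typescript
// Hypothetical sketch: split long text into chunks of at most maxChars,
// preferring to break between sentences.
function chunkText(text: string, maxChars: number): string[] {
  if (maxChars <= 0) {
    throw new Error("maxChars must be positive");
  }
  // A "sentence" ends with ., !, or ? plus trailing whitespace;
  // the second alternative catches a final run without punctuation.
  const sentences = text.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) ?? [];
  const chunks: string[] = [];
  let current = "";

  // Stores the current chunk (if it has content) and starts a new one.
  const flush = (): void => {
    const trimmed = current.trim();
    if (trimmed.length > 0) {
      chunks.push(trimmed);
    }
    current = "";
  };

  for (const sentence of sentences) {
    // Start a new chunk when this sentence would overflow the current one.
    if (current.length > 0 && current.length + sentence.length > maxChars) {
      flush();
    }
    // A single sentence longer than the limit is cut at hard boundaries.
    let rest = sentence;
    while (rest.length > maxChars) {
      chunks.push(rest.slice(0, maxChars));
      rest = rest.slice(maxChars);
    }
    current += rest;
  }
  flush();
  return chunks;
}
```

Each chunk can then be sent to the model separately, and the partial results combined afterwards, for example by summarizing the per-chunk summaries.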

What's next for Mentelo—An Interactive Chrome Extension You Can Talk To

I want to keep working on this Chrome extension and hope to publish it on the Chrome Web Store one day, where it can help other users like me. I plan to improve the features, fix more bugs, and refine the UI based on user feedback.
