# ChromeAI Studio
Your personal AI workspace that runs everywhere you browse. ChromeAI Studio adds an assistant bubble, a powerful sidebar, and context-aware actions for text selection, full‑page understanding, and voice.
## Inspiration
- Bring on‑device AI (Chrome Prompt API + Gemini Nano) to everyday browsing
- Reduce context friction: act on what you’re reading, selecting, or saying without copy/paste
- Be adaptable: student, developer, creator, or researcher modes with sensible defaults
- Make powerful AI safe by default in the browser UI (no data leaves the page for on‑device flows)
- Design an assistant that augments, not interrupts, with lightweight UI that respects the page
## What it does
- Floating action bubble to open the Studio and quick actions
- Smart Sidebar for chat, tools, mentions, and settings
- Text‑selection menu with explain/rewrite/translate/summarize
- Whole‑page summarization with URL→Markdown and streaming output
- Voice assistant with MCP providers and optional autonomous agent mode
- Wake‑word and continuous conversation (when voice mode is enabled)
- Role modes change prompts and tool defaults per context (study/code/write/research)
- Cross‑tab sync prevents competing voice sessions; compact toasts show progress and errors
- Zero‑indent rendering and bullet collapsing keep results readable in narrow panels
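Role modes (study/code/write/research) could be expressed as a small defaults table; the field names and prompt strings below are illustrative assumptions, not the extension's actual configuration:

```javascript
// Hypothetical sketch of role modes mapping to prompt and tool defaults.
// Keys match the modes named above; the values are made-up examples.
const ROLE_MODES = {
  study:    { promptPrefix: "Explain step by step for a student.", tools: ["summarize", "explain"] },
  code:     { promptPrefix: "Answer as a senior developer; prefer code.", tools: ["explain", "rewrite"] },
  write:    { promptPrefix: "Act as an editor; improve clarity and tone.", tools: ["rewrite", "translate"] },
  research: { promptPrefix: "Cite the page's own text where possible.", tools: ["summarize", "translate"] },
};

// Resolve a mode, falling back to neutral defaults for unknown values.
function resolveRole(mode) {
  return ROLE_MODES[mode] ?? { promptPrefix: "", tools: ["summarize"] };
}
```

Keeping the defaults in one table makes it cheap to sync a mode change across tabs: only the key needs to travel.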
## How we built it
- Chrome Extension (Manifest V3)
- Uses Chrome’s on‑device AI Prompt API via `ai/api-wrapper.js` and `services/ai-service.js`
- UI modules in `ui/*` (floating bubble, text selection, sidebar)
- Content extraction pipeline: readability + turndown + custom table converter
- Role modes stored in `localStorage` with cross‑tab sync
- Voice: Web Speech (STT/TTS) + MCP voice agent interface; user‑gesture gates for on‑device model setup
- Streaming architecture with throttled updates (~16ms) and final flush
- Defensive URL→Markdown with per‑site guards; fallback to plain text when needed
- Strict sanitization pass (link pruning, bullet flattening, overflow control)
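The throttled streaming path can be sketched as a small sink: chunks accumulate in a buffer, the UI re-renders at most once per ~16ms frame, and a final flush guarantees the last chunks are shown. This is a minimal sketch of the pattern, not the extension's actual module:

```javascript
// Throttled streaming sink: batches chunk renders to one per interval,
// with an explicit final flush (~16ms is roughly one 60fps frame).
function createThrottledSink(render, intervalMs = 16) {
  let buffer = "";
  let timer = null;
  return {
    push(chunk) {
      buffer += chunk;
      if (timer === null) {
        timer = setTimeout(() => {
          timer = null;
          render(buffer); // render accumulated text at most once per interval
        }, intervalMs);
      }
    },
    flush() {
      if (timer !== null) { clearTimeout(timer); timer = null; }
      render(buffer); // final flush so trailing chunks are never dropped
    },
  };
}
```

Throttling keeps layout reflow bounded no matter how fast the model streams, while the flush preserves correctness at end-of-stream.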
## Challenges we ran into
- On‑device AI components require a user gesture during initial download/setup
- Complex DOMs (news, app UIs) can break naïve HTML→Markdown; we added guards and fallbacks
- Streaming output into the sidebar while keeping the layout stable and readable
- Wake‑word + STT coordination without echo/duplication during TTS
- Handling mixed stream chunk types (strings vs Uint8Array) in a single decoder path
- Keeping CSS from the host page from bleeding into sidebar components
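The mixed-chunk-type problem can be handled by normalizing everything through one decoder, as in this sketch (using the standard `TextDecoder`, available in both browsers and Node):

```javascript
// Normalize mixed stream chunks (strings vs Uint8Array) into strings
// through a single decoder path.
const decoder = new TextDecoder("utf-8");

function decodeChunk(chunk) {
  if (typeof chunk === "string") return chunk;
  if (chunk instanceof Uint8Array) {
    // stream: true keeps multi-byte UTF-8 sequences split across chunks intact
    return decoder.decode(chunk, { stream: true });
  }
  throw new TypeError(`Unsupported chunk type: ${typeof chunk}`);
}
```

Passing `stream: true` matters: without it, a multi-byte character split across two `Uint8Array` chunks would decode as replacement characters.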
## Accomplishments that we're proud of
- Reusable AI Manager with streaming + non‑streaming paths
- Fast, readable summaries with paragraph‑first style for small pages
- Robust voice flow: MCP first, Chrome AI fallback, then extension AI
- Clean, minimal sidebar rendering with bullet flattening and zero‑indent lists
- Gesture‑gate utility that transparently retries model creation after a single click
- Modular UI that works on most sites without layout collisions
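The gesture-gate pattern mentioned above can be sketched as follows. The dependencies (`create`, `needsGesture`, `addClickListener`) are injected here for illustration; in the extension this would wrap the Prompt API's session creation and listen on the document:

```javascript
// Gesture gate sketch: attempt model creation; if it fails because a user
// gesture is required, retry exactly once inside the next click handler.
function gestureGate(create, needsGesture, addClickListener) {
  return new Promise((resolve, reject) => {
    create().then(resolve, (err) => {
      if (!needsGesture(err)) return reject(err); // unrelated failure: surface it
      // Defer the single retry until a user gesture is available.
      addClickListener(() => create().then(resolve, reject));
    });
  });
}
```

Callers just `await` the gate; the one-click retry is transparent to them.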
## What we learned
- Always prepare a gesture fallback for Prompt API creation
- Do HTML→Markdown defensively and keep a plain‑text extraction fallback
- Stream early and often; throttle UI updates for smoothness
- Small formatting fixes (bullets, links, line‑wrap) dramatically improve perceived quality
- Explicit error taxonomy → clearer UX: permission vs availability vs network vs model
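An error taxonomy like the one above might be implemented as a small classifier; the matching strings below are assumptions for illustration, not the extension's actual error messages:

```javascript
// Illustrative classifier mapping low-level failures to the four UX
// categories: permission vs availability vs network vs model.
function classifyError(err) {
  const msg = String((err && err.message) || err).toLowerCase();
  if (/gesture|permission|denied/.test(msg)) return "permission";
  if (/unavailable|not supported|downloading/.test(msg)) return "availability";
  if (/network|fetch|timeout/.test(msg)) return "network";
  return "model"; // anything else is treated as a model-side failure
}
```

Each category can then drive a distinct toast message and recovery action (prompt for a click, wait for download, retry, or rephrase).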
## What’s next for ChromeAI Studio
- More MCP tools, richer sidebar apps, multi‑page research sessions
- Per‑site tuning for better summary structure and link selection
## Built With
- javascript
- speechrecognition