# ChromeAI Studio
Your personal AI workspace that runs everywhere you browse. ChromeAI Studio adds an assistant bubble, a powerful sidebar, and context-aware actions for text selection, full‑page understanding, and voice.
## Inspiration
- Bring on‑device AI (Chrome Prompt API + Gemini Nano) to everyday browsing
- Reduce context friction: act on what you’re reading, selecting, or saying without copy/paste
- Be adaptable: student, developer, creator, or researcher modes with sensible defaults
- Make powerful AI safe by default in the browser UI (no data leaves the page for on‑device flows)
- Design an assistant that augments, not interrupts, with lightweight UI that respects the page
## What it does
- Floating action bubble to open the Studio and quick actions
- Smart Sidebar for chat, tools, mentions, and settings
- Text‑selection menu with explain/rewrite/translate/summarize
- Whole‑page summarization with URL→Markdown and streaming output
- Voice assistant with MCP providers and optional autonomous agent mode
- Wake‑word and continuous conversation (when voice mode is enabled)
- Role modes change prompts and tool defaults per context (study/code/write/research)
- Cross‑tab sync prevents competing voice sessions; compact toasts show progress and errors
- Zero‑indent rendering and bullet collapsing keep results readable in narrow panels
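Role modes (study/code/write/research) could be expressed as a small defaults table; the field names and prompt strings below are illustrative assumptions, not the extension's actual configuration:

```javascript
// Hypothetical sketch of role modes mapping to prompt and tool defaults.
// Keys match the modes named above; the values are made-up examples.
const ROLE_MODES = {
  study:    { promptPrefix: "Explain step by step for a student.", tools: ["summarize", "explain"] },
  code:     { promptPrefix: "Answer as a senior developer; prefer code.", tools: ["explain", "rewrite"] },
  write:    { promptPrefix: "Act as an editor; improve clarity and tone.", tools: ["rewrite", "translate"] },
  research: { promptPrefix: "Cite the page's own text where possible.", tools: ["summarize", "translate"] },
};

// Resolve a mode, falling back to neutral defaults for unknown values.
function resolveRole(mode) {
  return ROLE_MODES[mode] ?? { promptPrefix: "", tools: ["summarize"] };
}
```

Keeping the defaults in one table makes it cheap to sync a mode change across tabs: only the key needs to travel.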
## How we built it
- Chrome Extension (Manifest V3)
- Uses Chrome’s on‑device AI Prompt API via `ai/api-wrapper.js` and `services/ai-service.js`
- UI modules in `ui/*` (floating bubble, text selection, sidebar)
- Content extraction pipeline: readability + turndown + custom table converter
- Role modes stored in `localStorage` with cross‑tab sync
- Voice: Web Speech (STT/TTS) + MCP voice agent interface; user‑gesture gates for on‑device model setup
- Streaming architecture with throttled updates (~16ms) and final flush
- Defensive URL→Markdown with per‑site guards; fallback to plain text when needed
- Strict sanitization pass (link pruning, bullet flattening, overflow control)
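The throttled streaming path can be sketched as a small sink: chunks accumulate in a buffer, the UI re-renders at most once per ~16ms frame, and a final flush guarantees the last chunks are shown. This is a minimal sketch of the pattern, not the extension's actual module:

```javascript
// Throttled streaming sink: batches chunk renders to one per interval,
// with an explicit final flush (~16ms is roughly one 60fps frame).
function createThrottledSink(render, intervalMs = 16) {
  let buffer = "";
  let timer = null;
  return {
    push(chunk) {
      buffer += chunk;
      if (timer === null) {
        timer = setTimeout(() => {
          timer = null;
          render(buffer); // render accumulated text at most once per interval
        }, intervalMs);
      }
    },
    flush() {
      if (timer !== null) { clearTimeout(timer); timer = null; }
      render(buffer); // final flush so trailing chunks are never dropped
    },
  };
}
```

Throttling keeps layout reflow bounded no matter how fast the model streams, while the flush preserves correctness at end-of-stream.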
## Challenges we ran into
- On‑device AI components require a user gesture during initial download/setup
- Complex DOMs (news, app UIs) can break naïve HTML→Markdown; we added guards and fallbacks
- Streaming output into the sidebar while keeping the layout stable and readable
- Wake‑word + STT coordination without echo/duplication during TTS
- Handling mixed stream chunk types (strings vs Uint8Array) in a single decoder path
- Keeping CSS from the host page from bleeding into sidebar components
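The mixed-chunk-type problem can be handled by normalizing everything through one decoder, as in this sketch (using the standard `TextDecoder`, available in both browsers and Node):

```javascript
// Normalize mixed stream chunks (strings vs Uint8Array) into strings
// through a single decoder path.
const decoder = new TextDecoder("utf-8");

function decodeChunk(chunk) {
  if (typeof chunk === "string") return chunk;
  if (chunk instanceof Uint8Array) {
    // stream: true keeps multi-byte UTF-8 sequences split across chunks intact
    return decoder.decode(chunk, { stream: true });
  }
  throw new TypeError(`Unsupported chunk type: ${typeof chunk}`);
}
```

Passing `stream: true` matters: without it, a multi-byte character split across two `Uint8Array` chunks would decode as replacement characters.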
## Accomplishments that we're proud of
- Reusable AI Manager with streaming + non‑streaming paths
- Fast, readable summaries with paragraph‑first style for small pages
- Robust voice flow: MCP first, Chrome AI fallback, then extension AI
- Clean, minimal sidebar rendering with bullet flattening and zero‑indent lists
- Gesture‑gate utility that transparently retries model creation after a single click
- Modular UI that works on most sites without layout collisions
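The gesture-gate pattern mentioned above can be sketched as follows. The dependencies (`create`, `needsGesture`, `addClickListener`) are injected here for illustration; in the extension this would wrap the Prompt API's session creation and listen on the document:

```javascript
// Gesture gate sketch: attempt model creation; if it fails because a user
// gesture is required, retry exactly once inside the next click handler.
function gestureGate(create, needsGesture, addClickListener) {
  return new Promise((resolve, reject) => {
    create().then(resolve, (err) => {
      if (!needsGesture(err)) return reject(err); // unrelated failure: surface it
      // Defer the single retry until a user gesture is available.
      addClickListener(() => create().then(resolve, reject));
    });
  });
}
```

Callers just `await` the gate; the one-click retry is transparent to them.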
## What we learned
- Always prepare a gesture fallback for Prompt API creation
- Do HTML→Markdown defensively and keep a plain‑text extraction fallback
- Stream early and often; throttle UI updates for smoothness
- Small formatting fixes (bullets, links, line‑wrap) dramatically improve perceived quality
- Explicit error taxonomy → clearer UX: permission vs availability vs network vs model
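An error taxonomy like the one above might be implemented as a small classifier; the matching strings below are assumptions for illustration, not the extension's actual error messages:

```javascript
// Illustrative classifier mapping low-level failures to the four UX
// categories: permission vs availability vs network vs model.
function classifyError(err) {
  const msg = String((err && err.message) || err).toLowerCase();
  if (/gesture|permission|denied/.test(msg)) return "permission";
  if (/unavailable|not supported|downloading/.test(msg)) return "availability";
  if (/network|fetch|timeout/.test(msg)) return "network";
  return "model"; // anything else is treated as a model-side failure
}
```

Each category can then drive a distinct toast message and recovery action (prompt for a click, wait for download, retry, or rephrase).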
## What’s next for ChromeAI Studio
- More MCP tools, richer sidebar apps, multi‑page research sessions
- Per‑site tuning for better summary structure and link selection
## Built With
- javascript
- speechrecognition