x
Built With
- chrome-built-in-ai
x
x
Hello, I wanted to share a recent update with several major improvements. I didn’t include these changes in the main submission or the original GitHub repo because I wanted to keep the competition version unchanged.
Since this is a solution I plan to keep improving for my own use, I wanted to mention the latest enhancements. They are significant, and I have released them as a Version 2 in a separate repository.
One of the major technical innovations is our two-stage AI analysis system that dramatically improves privacy and reduces costs by keeping more processing on-device:
The Problem with Single-Stage Analysis: In a traditional single-stage approach, one AI call would need:
Entire HTML page (often 3000+ tokens) System prompt (~500 tokens) Custom instructions (~300 tokens) Personal knowledge base with user data (~2000-3000 tokens) Total: >6000 tokens → Forces processing to cloud (exceeds Gemini Nano limit) → User data sent to cloud → Privacy concerns + API costs
Our Two-Stage Solution:
Two-Stage Architecture:
Stage 1: HTML → Gemini 2.5 Flash → Form Structure (exact labels, types, IDs) Stage 2: Field Labels + Knowledge Base → Gemini 2.5 Flash → Exact Values Why This is Better for Privacy & Cost:
Stage 1: Form Structure Detection (~3500 tokens)
Input: HTML page only (no user data) Output: Extracted field labels, types, IDs Privacy: No sensitive user data involved, cloud processing is acceptable May go to cloud, but contains no private information Stage 2: Value Extraction (~2500 tokens)
Input: Only extracted field labels (not entire HTML) + knowledge base Key benefit: Removed massive HTML, greatly reduced token count Result: Much higher probability of staying under 6000 tokens Privacy: More likely to process on-device with Gemini Nano (user data stays local) Cost: Avoids cloud API calls when possible Benefits:
✅ Privacy: Stage 2 (with user data) more likely to stay on-device ✅ Cost savings: Fewer cloud API calls = lower costs ✅ Speed: On-device processing is faster (<200ms vs 2s) ✅ Exact field matching: Stage 2 returns {"Email Address": "value"} with exact field labels ✅ Better context: Stage 1 provides field metadata to Stage 2
Another key innovation is giving users full control over where their embeddings are stored.
Why This Matters:
Storage Comparison:
| Feature | Local Storage | Cloud Storage |
|---|---|---|
| Capacity | 10MB (~20 docs) | Unlimited (petabytes) |
| Privacy | 100% local | Data in Google Cloud |
| Speed | <10ms | ~500ms |
| Multi-device | ❌ No | ✅ Yes |
| Cost | Free | ~$0.02/GB/month |
The extension provides two voice input options:
Primary: Gemini Live API 2.5 (preferred when available)
Fallback: Web Speech API (when Gemini Live unavailable)
Leave feedback in the comments!
Log in or sign up for Devpost to join the conversation.
Log in or sign up for Devpost to join the conversation.