CitizenOne: The Bureaucracy Killer
💡 The Inspiration
We live in an era of "Agentic AI," yet citizens are still forced to act as manual data routers. Whether applying for a visa, a business permit, or a simple address change, we re-type the same verified information into disconnected, legacy portals.
I was inspired by the concept of Sovereign Data. Why should a form take 30 minutes of manual typing when an AI can "map" the intent of a field to a local vault in milliseconds? I wanted to build a tool that wasn't just another chatbot, but an Action-Oriented Agent that works for free, respects privacy, and runs on a mobile phone.
🏗️ How I Built It
The core challenge was moving away from expensive "Vision" models (which require high-end GPUs or costly API credits) to a Semantic DOM Mapping approach.
- The Brain (BYOK): I built a unified `ProviderFactory` allowing users to "Bring Your Own Key." This supports Gemini 1.5 Flash (free tier), Groq (for lightning-fast inference), and NVIDIA NIM.
- The Vault: Security was paramount. I implemented a local-first encryption layer on top of `chrome.storage.local`.
- The Logic: Instead of sending raw HTML, the extension extracts a "Semantic Map."
- Zero-Knowledge Loop: The AI only sees labels (e.g., "Given Name"). It never sees the user's actual data. The matching happens locally on the user's device.
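The zero-knowledge loop above can be sketched in a few lines. This is an illustrative mock, not the extension's actual source: names like `semanticMap`, `fillPlan`, and `mockLlmMatch` are made up, and the LLM call is stubbed out. The point is the data flow: only labels go out, and labels are joined with real vault values locally.

```javascript
// Sketch of the zero-knowledge matching loop (illustrative names only).
// Only field labels ever leave the device; the vault stays local.

// 1. Semantic Map: labels extracted from the page, no user data.
const semanticMap = [
  { fieldId: "input-17", label: "Given Name" },
  { fieldId: "input-18", label: "Family Name" },
  { fieldId: "input-22", label: "Postal Code" },
];

// 2. Local vault: the user's verified data, never sent to the LLM.
const vault = {
  givenName: "Ada",
  familyName: "Lovelace",
  postalCode: "10117",
};

// 3. Stand-in for the cloud LLM call: given labels ONLY, it returns
//    which vault key each label maps to. A real call would hit
//    Gemini 1.5 Flash, Groq, or NVIDIA NIM via the provider factory.
function mockLlmMatch(labels) {
  const rules = {
    "given name": "givenName",
    "family name": "familyName",
    "postal code": "postalCode",
  };
  return labels.map((l) => rules[l.toLowerCase()] ?? null);
}

// 4. The join happens locally: only here are labels paired with values.
function fillPlan(map, vaultData) {
  const keys = mockLlmMatch(map.map((f) => f.label));
  return map.map((f, i) => ({
    fieldId: f.fieldId,
    value: keys[i] ? vaultData[keys[i]] : null,
  }));
}

console.log(fillPlan(semanticMap, vault));
```

The design choice worth noting: because the model returns a label-to-key mapping rather than filled values, a logging or interception of the API traffic reveals nothing about the citizen.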
🧠 The Math of Efficiency
To quantify the impact, we can model the total time saved as the difference between the manual entry time for $n$ fields and the AI's constant processing overhead $t_a$:

$$T_{saved} = \left( \sum_{i=1}^{n} t_{manual,i} \right) - t_a$$

where $t_a \approx 0.8\,\text{s}$ when using high-speed inference like Groq, whereas the manual term $\sum t_{manual,i}$ scales linearly with the amount of bureaucracy. By optimizing the "DOM Stripping" process, we ensure the input token count $C_{tokens}$ stays within free-tier limits:

$$C_{tokens} \ll \text{Limit}_{\text{free tier}}$$
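Plugging illustrative numbers into the formula (a 20-field form at roughly 15 seconds of typing per field, against a 0.8 s inference overhead; these figures are assumptions for the example, not measurements):

```javascript
// Worked example of the T_saved formula with illustrative numbers.
function timeSaved(manualTimes, overhead) {
  // Sum of per-field manual entry times minus constant AI overhead.
  return manualTimes.reduce((sum, t) => sum + t, 0) - overhead;
}

const tManual = Array(20).fill(15); // 20 fields, ~15 s of typing each
const tA = 0.8;                     // constant overhead at Groq speeds

console.log(timeSaved(tManual, tA)); // → 299.2 seconds saved
```

Because $t_a$ is constant while the manual sum grows with every extra field, the savings only get larger as forms get longer.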
🛑 Challenges I Faced
- Shadow DOMs & Iframes: Many government sites bury input fields inside nested iframes and shadow roots, where ordinary query selectors can't reach them. I had to write a recursive injection script to "pierce" these layers.
- Token Optimization: Sending full HTML to an LLM is expensive. I implemented Tree Shaking for the DOM—removing scripts, styles, and empty divs to create a "Lite Map" that fits into the smallest context windows.
- Mobile Compatibility: Ensuring the extension worked on Kiwi Browser required a "Bottom-Sheet" UI, as traditional sidebars break on small vertical screens.
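The Token Optimization step above can be sketched as a recursive filter. This runs over a plain object tree rather than a live DOM so it stays self-contained, and the function name `shake` and the keep-rules are illustrative assumptions, not the extension's actual code:

```javascript
// Sketch of DOM "tree shaking" over a plain object tree (illustrative).
// Drops script/style nodes and containers left empty after pruning, so
// only semantically useful nodes reach the LLM prompt.
const DROP_TAGS = new Set(["script", "style", "noscript"]);

function shake(node) {
  if (DROP_TAGS.has(node.tag)) return null;
  const children = (node.children ?? [])
    .map(shake)
    .filter((c) => c !== null);
  const keep =
    children.length > 0 ||  // still has useful descendants
    node.tag === "input" || // form fields always survive
    node.tag === "label" ||
    Boolean(node.text);     // visible text survives
  return keep ? { ...node, children } : null;
}

const page = {
  tag: "div",
  children: [
    { tag: "script", text: "trackUser();" }, // stripped
    { tag: "div", children: [] },            // empty div: pruned
    {
      tag: "div",
      children: [
        { tag: "label", text: "Given Name" },
        { tag: "input", id: "fname" },
      ],
    },
  ],
};

console.log(JSON.stringify(shake(page), null, 2));
```

On a real page the same pass would also discard inline styles and attributes the model doesn't need, which is what keeps the "Lite Map" inside even the smallest free-tier context windows.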
🎓 What I Learned
I learned that the most powerful AI tools aren't those with the most parameters, but those with the most agency. Building CitizenOne taught me that privacy-preserving AI is possible by separating Reasoning (in the cloud) from Data (on the local device).