PrivAgent Mail


Inspiration

We were inspired by a conflict nearly everyone faces today: we want powerful AI tools to boost our productivity, but we are deeply concerned about sending sensitive company data or personal information to third-party servers. Reading an email summary is fast, but what if the summary service is logging your clients' names, internal project details, or personal addresses? We wanted to build a tool that gives you the "best of both worlds": an AI-powered email summarizer that never sends unmasked sensitive data off the device.

What it does

PrivAgent Mail delivers privacy-first email summarization inside Gmail through a four-stage zero-trust workflow. Every step runs locally until content is anonymized, so sensitive text never leaves the browser unprotected.

Hybrid on-device entity detection
Regex rules for addresses, IDs, and numbers combine with Compromise.js NLP plus contextual hints from headers, salutations, and signatures. A dynamic in-memory dictionary carries entities across threads, so nuanced PII is caught without any cloud calls.
Deterministic masking with reversible mapping
Detected tokens are swapped for scoped placeholders such as {{PERSON_1}} and stored in a transient map maintained by the background service worker. The original wording stays available for restoration without ever persisting sensitive data.
Schema-locked LLM summarization
Only the masked text travels to the configured OpenAI chat endpoint. Schema-guarded prompts force structured JSON—summary, action flags, suggested labels—while preserving placeholder integrity for a lossless round trip.
Agentic follow-through inside Gmail
Content scripts inject the summary bubble inline with Gmail threads, unmask the response locally, and surface suggested actions or label shortcuts. With the granted OAuth scopes, the extension can apply Gmail labels the moment the user confirms.
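
The mask-and-restore round trip described in the four stages above can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: the two regex patterns, the function names maskText and unmaskText, and the knownNames parameter (standing in for names found by the NLP and contextual stages) are all assumptions made for the example.

```javascript
// Hypothetical patterns; a real rule set would cover many more entity types.
const PATTERNS = [
  { type: "EMAIL", regex: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { type: "PHONE", regex: /\+?\d[\d\s().-]{7,}\d/g },
];

// Replace detected entities with scoped placeholders such as {{PERSON_1}}.
// Returns the masked text plus a transient placeholder -> original map.
function maskText(text, knownNames = []) {
  const map = new Map();  // placeholder -> original text
  const seen = new Map(); // original text -> placeholder (keeps masking deterministic)
  const counters = {};

  const placeholderFor = (type, original) => {
    if (seen.has(original)) return seen.get(original);
    counters[type] = (counters[type] || 0) + 1;
    const placeholder = `{{${type}_${counters[type]}}}`;
    seen.set(original, placeholder);
    map.set(placeholder, original);
    return placeholder;
  };

  let masked = text;
  for (const { type, regex } of PATTERNS) {
    masked = masked.replace(regex, (match) => placeholderFor(type, match));
  }
  for (const name of knownNames) {
    // Escape regex metacharacters so the name is matched literally.
    const escaped = name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    masked = masked.replace(new RegExp(`\\b${escaped}\\b`, "g"), (match) =>
      placeholderFor("PERSON", match)
    );
  }
  return { masked, map };
}

// Restore originals locally; unknown placeholders are left untouched.
function unmaskText(text, map) {
  return text.replace(/\{\{[A-Z]+_\d+\}\}/g, (ph) => map.get(ph) ?? ph);
}
```

Because the same original string always receives the same placeholder, repeated mentions of a person stay consistent in the masked text, and the map only needs to live in memory for as long as the summary is being produced.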

How we built it

PrivAgent Mail is built primarily with JavaScript, HTML, and CSS. The core architecture consists of:

Content Scripts: JavaScript that runs on the email provider's webpage (like Gmail) to read email content and inject the summary UI ("bubble").
Local Masking Engine: A custom-built engine that identifies Personally Identifiable Information (PII) by combining regular expressions for common patterns (emails, phone numbers, credit cards) with the lightweight NLP library compromise.js, contextual logic, and a dynamic dictionary. The engine creates a temporary, local "map" that links each placeholder to its original_text.
LLM API: A background script that manages the asynchronous fetch call to the OpenAI API (e.g., gpt-4o-mini for speed and efficiency), sending only the masked text and requesting JSON output constrained by a JSON Schema.
Local Mapping Engine: This is the reverse of the masking engine. It parses the AI-generated summary and uses the local "map" to substitute the placeholders back with the original sensitive words, presenting a seamless, readable summary to the user.
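
As a rough illustration of the LLM API component, the sketch below builds a Chat Completions request that uses OpenAI's structured-output option (response_format with type "json_schema"). The schema field names (summary, action_required, suggested_labels), the prompt wording, and the function names are assumptions for this example; the project's real schema and prompts may differ.

```javascript
// Hypothetical schema mirroring "summary, action flags, suggested labels".
const SUMMARY_SCHEMA = {
  type: "object",
  properties: {
    summary: { type: "string" },
    action_required: { type: "boolean" },
    suggested_labels: { type: "array", items: { type: "string" } },
  },
  required: ["summary", "action_required", "suggested_labels"],
  additionalProperties: false,
};

// Build the request body. Only masked text is ever placed in the messages.
function buildSummaryRequest(maskedText, model = "gpt-4o-mini") {
  return {
    model,
    messages: [
      {
        role: "system",
        content:
          "Summarize the email. Tokens of the form {{TYPE_N}} are opaque " +
          "placeholders: copy them exactly and never translate, split, or invent them.",
      },
      { role: "user", content: maskedText },
    ],
    response_format: {
      type: "json_schema",
      json_schema: { name: "email_summary", strict: true, schema: SUMMARY_SCHEMA },
    },
  };
}

// Send the request from the background script and parse the structured reply.
async function summarizeMasked(maskedText, apiKey) {
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(buildSummaryRequest(maskedText)),
  });
  if (!response.ok) {
    throw new Error(`OpenAI request failed with status ${response.status}`);
  }
  const data = await response.json();
  // With structured output, the message content is a JSON string.
  return JSON.parse(data.choices[0].message.content);
}
```

Keeping the request builder separate from the network call makes it easy to assert in tests that nothing but masked text ends up in the payload.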

Challenges we ran into

Balancing Masking Accuracy with Performance:

Finding the sweet spot for on-device sensitive data detection was difficult. Relying only on regex was fast but too rigid, and it missed contextual PII. We integrated compromise.js for lightweight NLP, but its out-of-the-box accuracy was a challenge: it would sometimes miss non-obvious entities or misclassify common words. We had to build a hybrid engine that supports compromise.js with our own contextual logic, while keeping the whole process fast enough in the browser to avoid noticeable lag.

To catch non-obvious names and avoid misclassifying common words, we built a hybrid engine (anonymizer.js). Before compromise.js even runs, our script applies contextual hints: it scans the To/CC/BCC fields, greeting lines (like "Dear" or "Hello"), and known email signature patterns to find likely names. We also identify names with titles (like Dr. or Ms.) and extract potential organization names from non-generic email domains.

Furthermore, we built a series of heuristic filters (isLikelyPersonName) to reduce false positives, checking for capitalization and excluding generic terms like "team" or "customer." Finally, we implemented a dynamic dictionary to cache identified names, ensuring consistency across a single email and even across multiple replies in the same thread. Engineering this multi-layered system to run instantly without causing any browser lag was a significant hurdle.
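
A simplified sketch of the greeting hints, the false-positive filter, and the dynamic dictionary might look like the following. Only the name isLikelyPersonName comes from the description above; its body, the generic-term list, extractGreetingNames, and the EntityDictionary class are illustrative assumptions rather than the contents of anonymizer.js.

```javascript
// Hypothetical list of generic words that should never be treated as names.
const GENERIC_TERMS = new Set(["team", "customer", "all", "everyone", "support"]);

// Heuristic filter: one to three capitalized words, none of them generic.
function isLikelyPersonName(candidate) {
  const words = candidate.trim().split(/\s+/);
  if (words.length > 3) return false;
  return words.every(
    (word) => /^[A-Z][a-z'-]+$/.test(word) && !GENERIC_TERMS.has(word.toLowerCase())
  );
}

// Contextual hint: pull candidate names from greeting lines, skipping titles.
function extractGreetingNames(body) {
  const greeting =
    /^(?:Dear|Hello|Hi)[ \t]+(?:(?:Dr|Mr|Mrs|Ms|Prof)\.?[ \t]+)?([A-Z][\w'-]*(?:[ \t]+[A-Z][\w'-]*)*)/gm;
  const names = [];
  for (const match of body.matchAll(greeting)) {
    if (isLikelyPersonName(match[1])) names.push(match[1]);
  }
  return names;
}

// Dynamic dictionary: the same name always maps to the same placeholder,
// so replies later in a thread reuse earlier assignments.
class EntityDictionary {
  constructor() {
    this.byName = new Map();
  }
  placeholderFor(name) {
    const key = name.toLowerCase();
    if (!this.byName.has(key)) {
      this.byName.set(key, `{{PERSON_${this.byName.size + 1}}}`);
    }
    return this.byName.get(key);
  }
}
```

Names collected this way can be fed to the masking step before the NLP pass runs, so the cheaper contextual checks catch the most obvious cases first.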

Ensuring Placeholder Integrity:

It's not enough to just send masked text; we had to guarantee the AI's summary could be perfectly "un-masked" locally. In early tests, the model would sometimes alter or truncate placeholders, making restoration impossible. This forced us to adopt a JSON Schema and design specific prompts that make the model treat our placeholders as immutable tokens.
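
One way to detect a broken placeholder before restoration is a local integrity check on the model's output. The sketch below is an assumption about how such a check could work, not the project's actual validation; checkPlaceholderIntegrity and its rules (unknown placeholders, stray braces, bare TYPE_N tokens) are hypothetical.

```javascript
// Matches well-formed placeholders such as {{PERSON_1}}.
const PLACEHOLDER = /\{\{[A-Z]+_\d+\}\}/g;

// Verify that the model returned only placeholders we issued, fully intact.
// `map` is the transient placeholder -> original Map built during masking.
function checkPlaceholderIntegrity(modelText, map) {
  const found = modelText.match(PLACEHOLDER) ?? [];
  const unknown = found.filter((placeholder) => !map.has(placeholder));

  // After removing well-formed placeholders, any leftover brace or bare
  // TYPE_N token suggests that a placeholder was altered or truncated.
  const residue = modelText.replace(PLACEHOLDER, "");
  const damaged = residue.match(/[{}]|\b[A-Z]+_\d+\b/g) ?? [];

  return {
    ok: unknown.length === 0 && damaged.length === 0,
    unknown,
    damaged,
  };
}
```

If the check fails, the extension could retry the request or decline to show the summary rather than display a partially restored one.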

Accomplishments that we're proud of

We’re proud that PrivAgent Mail demonstrates how cutting-edge AI assistance and strict data privacy can truly coexist — efficiently, intelligently, and securely.

True On-Device Privacy Protection: A full local anonymization pipeline detects, masks, and restores sensitive data entirely inside the browser. → No unprotected information ever leaves the device, realizing a zero-trust privacy model rarely achieved in LLM-based tools.
Schema-Guarded LLM Summarization: Using prompt engineering and a strict JSON Schema, the model outputs deterministic, structured results. → Ensures placeholder integrity, consistent formatting, and safe automation, turning the LLM into a predictable summarization engine.
Robustness Across Real-World Emails: Tested on diverse emails (invoices, threads, multilingual content), the system stays stable under paraphrasing and irregular formatting. → Maintains both privacy protection and semantic accuracy even in noisy, real-world conditions.
Lightweight and Efficient Integration: All NLP steps run within a Chrome extension, optimized through Regex + Compromise.js pipelines. → Delivers instant, memory-efficient performance, proving that privacy-first AI can be lightweight and deployable.
Intelligent and Automated Agent Workflow: The system autonomously performs classification, intent detection, and action suggestion, forming a full loop: Detect → Anonymize → Summarize → Act. → Enables one-click productivity while keeping data completely secure.
Seamless and Customizable User Experience: Summaries and suggestions appear directly inside the inbox UI, ensuring natural, frictionless interaction. → A modular design allows domain-specific adaptation (e.g., legal, finance, academic), keeping the system personalizable and future-proof.

What we learned

Through the development of PrivAgent Mail, we not only deepened our understanding of front-end engineering but also explored how privacy-preserving AI systems can be both practical and elegant.

Full-Stack Browser Engineering: We mastered how to combine JavaScript, HTML, and CSS with browser APIs to build a real-time, responsive Chrome extension capable of intercepting, analyzing, and rewriting live email content safely. → Learned to design complex client logic that runs efficiently within user-facing browser environments.
Privacy-by-Design Architecture: We gained hands-on experience in on-device anonymization pipelines, discovering how privacy can be enforced at the architectural level rather than as an afterthought. → Built a foundation for “zero-trust” AI design where no sensitive data ever leaves the client unmasked.
Prompt Engineering & Schema Design: We learned how careful prompt engineering and structured JSON schema outputs can transform large language models into deterministic, rule-following agents. → Enabled predictable summarization while preserving placeholders and enabling downstream automation.
Real-World Robustness and Testing: By iterating on diverse email datasets, we learned how to handle noise, from messy HTML and signatures to multilingual text and nested threads. → Strengthened our system’s robustness, stability, and cross-context reliability.

What’s next for PrivAgent Mail

We see PrivAgent Mail not just as a plugin, but as the foundation for a privacy-first communication ecosystem. The next phase expands both functionality and reach.

Cross-Platform & Client Expansion: We plan to extend PrivAgent to more ecosystems, including Microsoft Outlook, Yahoo Mail, QQ Mail, and enterprise clients. → Browser support will expand beyond Chrome to Firefox and Safari, with further exploration of mobile versions for iOS and Android to make privacy-preserving summarization available anywhere users work.
Smart “Private Reply” Generation: Building upon our anonymization workflow, we will introduce Private Reply Generation. Users can compose prompts like:

“Tell [PERSON_1] I’ll have the [PROJECT_X] documents ready by Friday.”

The agent masks sensitive tokens, sends the anonymized prompt for drafting, and restores it locally, producing a polished, ready-to-send reply that never exposes private data. → Enables context-aware, secure, AI-assisted communication directly inside the inbox.

Adaptive Domain Intelligence: We aim to tailor PrivAgent for specialized domains such as legal correspondence, healthcare records, and corporate finance, where confidentiality is critical. → Each domain will have its own optimized rule sets, entity dictionaries, and schema templates for even higher accuracy.
Collaborative Privacy Framework: Looking forward, we hope to open-source part of our on-device anonymization library, inviting collaboration from the privacy and AI research community. → Promotes transparency, verifiability, and collective improvement of privacy-preserving AI techniques.