Butter Mail: How It Was Built, How the AI/ML Works, and Why It’s Private

How the app was created

Butter Mail is an Electron desktop app. The main process (main.js) handles IMAP (via imapflow and mailparser), all ML work, and file I/O. The renderer (browser) runs the UI (app.js, index.html, style.css) and talks to the main process through a preload script (preload.js) that exposes a small electronAPI; context isolation is on and nodeIntegration is off in the renderer.

Views include a timeline (threads built from Message-Id / In-Reply-To / References headers) and a 3D graph (PCA-projected embeddings rendered in Three.js).

There are no generative LLM APIs (no OpenAI, Anthropic, etc.). The "AI" is local ML only: embeddings, sparse + dense search, clustering, and PCA, all running on your machine.
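The timeline's thread-building from Message-Id / In-Reply-To / References can be sketched with a small union-find pass. This is an illustrative helper (buildThreads and the message shape are assumptions, not the app's actual code): messages whose reference chains share a root Message-Id end up in the same thread.

```javascript
// Sketch of header-based threading: union each message with every
// Message-Id it references, then group by the resulting root.
// Hypothetical helper, not the app's actual implementation.
function buildThreads(messages) {
  const rootOf = new Map(); // messageId -> parent in the union-find forest
  const find = (id) => {
    while (rootOf.has(id) && rootOf.get(id) !== id) id = rootOf.get(id);
    return id;
  };
  for (const msg of messages) {
    if (!rootOf.has(msg.messageId)) rootOf.set(msg.messageId, msg.messageId);
    // Link this message to every ancestor named in its headers.
    const ancestors = [...(msg.references ?? []), msg.inReplyTo].filter(Boolean);
    for (const ref of ancestors) {
      if (!rootOf.has(ref)) rootOf.set(ref, ref);
      rootOf.set(find(msg.messageId), find(ref)); // union the two chains
    }
  }
  const threads = new Map(); // root Message-Id -> messages in that thread
  for (const msg of messages) {
    const root = find(msg.messageId);
    if (!threads.has(root)) threads.set(root, []);
    threads.get(root).push(msg);
  }
  return threads;
}
```

A reply that only carries In-Reply-To still lands in the right thread, because the union step walks whatever chain fragments are present.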

Why it’s private

No cloud AI: No text is sent to external LLM or embedding APIs. All embedding and search logic runs in the Electron main process.

Local embedding model: The app uses @huggingface/transformers with Xenova/all-MiniLM-L6-v2 (384-dim, sentence-transformers style). The model is loaded and run locally (via ONNX in Node), so email content never leaves the machine.

IMAP and data flow: Mail is fetched over IMAP; credentials are stored only in Electron userData (see below). Email text is used only for local indexing, embedding, and search.

So "private" here means: no third-party AI services and no sending of email content to the internet; the only network traffic is to your own IMAP server, and all computation happens locally.

How each AI/ML feature was created (criteria and implementation)

Embeddings (embeddings-service.js)
Purpose: Turn each email (subject + body) into a fixed-size vector for similarity search and clustering.
Criteria: A small, fast model that runs in Node; no API keys; the same representation for emails and for the search query.
Implementation: Xenova/all-MiniLM-L6-v2 via @huggingface/transformers (feature-extraction pipeline). Input: subject + plain text from the body (HTML stripped, length capped around 512 tokens). Output: 384-dim vectors, mean-pooled and normalized. Embedding runs in batches (e.g. 8) for progress reporting. Used for: (a) indexing all emails, (b) encoding the search query, (c) prompt-based clusters (query embedding vs. email embeddings).

Hybrid search (search-service.js)
Purpose: Search emails by meaning (dense) and by keywords (sparse), then combine the results.
Criteria: No external search API; works with the local embedding model; combines sparse and dense rankings in a standard way.
Implementation: Sparse: wink-bm25-text-search over title (subject) and body, with simple tokenization (lowercased, non-word characters stripped). Dense: cosine similarity between the query embedding (from the same MiniLM model) and each email's embedding. Fusion: Reciprocal Rank Fusion (RRF) with a fixed k (e.g. 60), so that neither ranking dominates. Fallback when both fail: a simple substring match on subject/body. Everything runs in the main process; the UI calls search:hybrid and displays the merged, scored list.

Automatic clustering (clustering.js)
Purpose: Group emails into categories without fixed labels.
Criteria: Unsupervised; no fixed number of clusters; robust to noise.
Implementation: DBSCAN (density-clustering) on the 384-dim embeddings. Defaults: eps = 0.6, minPts = 2 (suitable for normalized vectors). Output: a cluster ID per email plus "noise"; clusters get generic names and fixed colors. Results drive the "subtab" filter (all / cluster-1 / cluster-2 / … / Uncategorized).
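The core of the hybrid search, dense scoring plus rank fusion, can be sketched in a few lines. The function names here (cosine, rrfFuse) are illustrative, not the app's actual identifiers; the k = 60 default matches the value mentioned above.

```javascript
// Embeddings are assumed L2-normalized, so cosine similarity
// reduces to a plain dot product.
function cosine(a, b) {
  let dot = 0;
  for (let i = 0; i < a.length; i++) dot += a[i] * b[i];
  return dot;
}

// Reciprocal Rank Fusion: each ranking contributes 1 / (k + rank) to a
// document's score, so a single ranking cannot dominate the merged order.
function rrfFuse(rankings, k = 60) {
  const scores = new Map(); // doc id -> fused score
  for (const ranking of rankings) {
    ranking.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()].sort((x, y) => y[1] - x[1]).map(([id]) => id);
}
```

Feeding rrfFuse the BM25 ranking and the cosine-similarity ranking yields the merged list; a document ranked well by both sources accumulates two sizable reciprocal-rank contributions and rises to the top.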
Prompt-based clusters (in search-service.js + UI)
Purpose: Let the user define a cluster by a phrase (e.g. "refund" or "newsletter") and see which emails are similar, without calling any LLM.
Criteria: Uses only the existing local embeddings; no generative model; the user controls inclusion (by threshold or by toggling individual emails).
Implementation: The prompt is embedded with the same MiniLM pipeline; cosine similarity is computed against all email embeddings. The UI receives a scored list (embeddings:promptClusterScored); the user then sets a similarity threshold and can include/exclude individual emails. The resulting set is saved as a named "prompt cluster" and shown as another subtab (e.g. "prompt-refund").

3D graph / PCA (pca-utils.js)
Purpose: Visualize the embedding space in 3D so clusters and similarity are interpretable.
Criteria: Reduce 384 dimensions to 3 in a principled way; reuse the same projection when filtering (e.g. by category).
Implementation: ml-pca fits PCA on the current embedding matrix and projects to 3 components; the model (mean, scale, etc.) can be saved so that when the user filters (e.g. to one cluster), the same model projects the subset and 3D positions stay consistent. Points are rendered in the graph view with Three.js.
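Reusing a fitted PCA model on a filtered subset boils down to subtracting the stored mean and taking dot products with the saved component vectors. This is a hypothetical helper showing that idea (ml-pca's own predict() does the equivalent); the model shape here is an assumption for illustration.

```javascript
// Sketch of projecting vectors with a previously fitted PCA model:
// center each vector by the stored mean, then dot it with each saved
// principal component. Because the model is fixed, a filtered subset
// lands at the same 3D coordinates it had in the full projection.
function projectWithSavedModel(vectors, model) {
  const { mean, components } = model; // components: arrays of length d
  return vectors.map((v) =>
    components.map((comp) => {
      let sum = 0;
      for (let i = 0; i < v.length; i++) sum += (v[i] - mean[i]) * comp[i];
      return sum;
    })
  );
}
```

For the 3D graph, components would hold the top three principal axes of the 384-dim embedding matrix.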

Local storage and on-disk data

Renderer (browser) – localStorage (in app.js): All of this lives in the Electron renderer's origin, so it is effectively per-app and local to the machine.
butter-mail-emails – List of emails (from IMAP and/or import).
butter-mail-embeddings – Map of email ID → 384-dim embedding array.
butter-mail-categories – DBSCAN results: { assignments, meta } (which email is in which cluster, plus names/colors).
butter-mail-pca – Serialized PCA model (for the 3D projection).
butter-mail-pca-points – Map of email ID → 3D point for the graph.
butter-mail-prompt-clusters – User-defined prompt clusters: each entry has a label and the list of email IDs in that cluster.

So: emails, embeddings, categories, PCA state, and prompt clusters are all kept in localStorage, where they persist across sessions and avoid re-fetching and re-computing until the user refreshes or clears the data.

Main process – disk (in main.js): The only file the app writes is imap-config.json in Electron's userData directory (e.g. %AppData%/butter-mail on Windows). It holds the IMAP host, port, user, and password so the app can reconnect without re-entering credentials. No email content or embeddings are written to disk; only this config file is.
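The persistence pattern for these keys is a plain JSON round-trip. A minimal sketch, assuming the butter-mail-embeddings key from above; the storage parameter stands in for window.localStorage (same getItem/setItem interface), and the function names are illustrative:

```javascript
// Sketch of persisting the email-ID -> embedding map across sessions.
// `storage` is any object with localStorage's getItem/setItem interface.
const KEY = 'butter-mail-embeddings';

function saveEmbeddings(storage, embeddingsById) {
  // Serialize the whole map; typed arrays would need conversion to
  // plain arrays first, since JSON.stringify drops them otherwise.
  storage.setItem(KEY, JSON.stringify(embeddingsById));
}

function loadEmbeddings(storage) {
  const raw = storage.getItem(KEY);
  return raw ? JSON.parse(raw) : {}; // empty map until the first index run
}
```

On startup, a non-empty load result lets the app skip re-embedding entirely, which is why clearing data forces a full re-index.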
