📬 Project Story — CodePal

💡 Inspiration

For years, I've been frustrated by the digital ritual of context-switching: sign up, flip to Gmail, hunt for the code, copy, flip back, paste. It's a minor task, but its constant repetition is a major drag on focus and flow.

When I learned about Gemini Nano and its ability to run powerful AI models directly and privately within the browser, I had a moment of clarity. This was the perfect technology to solve this exact problem. The idea was electric: what if I could build a tool that was not only smart but also respected user privacy completely?

And so, CodePal was born.


🏗️ How I Built It

UI & Frontend:

Built with pure HTML, CSS, and JavaScript, featuring a minimalist popup interface for immediate access to the latest code.

The core isn't just detecting whether Gmail is open; a lightweight Content Script (gmail-monitor.js) runs directly on the Gmail page, actively monitoring the inbox for changes in real time.

Email Access & Parsing:

The Content Script locally observes changes to the Gmail DOM to detect new, unread emails. Because detection happens inside the page itself, no email data leaves the user's browser as long as the cloud fallback tier stays disabled (the default).

The script performs an initial pre-screening using keywords like “code,” “OTP,” or “verification,” only passing highly relevant email content to the extraction engine.
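
A minimal sketch of what such a pre-screen could look like. The function name looksLikeOtpEmail and the exact pattern are assumptions for illustration; only the three keywords come from the description above.

```javascript
// Assumed keyword pattern built from the three keywords named above.
// Word boundaries keep words like "barcode" from matching "code".
const OTP_KEYWORD_PATTERN = /\b(code|otp|verification)\b/i;

// Returns true when the subject or preview text mentions an OTP-related keyword.
// Only emails that pass this check would be handed to the extraction engine.
function looksLikeOtpEmail(subject, snippet) {
  return OTP_KEYWORD_PATTERN.test(`${subject} ${snippet}`);
}
```

The check is deliberately cheap: it only decides whether an email is worth a closer look, not what the code is.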

The Three-Tier Intelligent Extraction Engine:

I engineered a three-tier system to balance speed, privacy, and accuracy:

Tier 1: Local Regex Engine (<50ms) - For lightning-fast extraction of codes from standard, predictable email formats.

Tier 2: Gemini Nano (On-Device AI) - Using Chrome's built-in Prompt API, this tier processes more complex and varied email structures directly on the user's machine. It's the core of CodePal's privacy-first approach.

Tier 3: Gemini API (Cloud Fallback) - A powerful cloud-based fallback for the most challenging cases, ensuring maximum coverage. (Note: This tier was disabled by default for the hackathon build to focus on the on-device experience).
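
The tiered fallback can be sketched roughly as follows. This is an illustration under assumptions, not the shipped code: the regex pattern, the function names, and the tier object shape are hypothetical, and the Tier 2 and Tier 3 extractors are placeholders rather than real Prompt API or Gemini API calls.

```javascript
// Tier 1 (assumed pattern): a keyword, up to 40 non-digit characters,
// then a 4-8 digit code that is not part of a longer number.
const CODE_NEAR_KEYWORD = /\b(?:code|otp|verification)\b\D{0,40}?(\d{4,8})(?!\d)/i;

function extractCodeWithRegex(text) {
  const match = CODE_NEAR_KEYWORD.exec(text);
  return match ? match[1] : null;
}

// Runs the enabled tiers in order and returns the first code found,
// together with the name of the tier that produced it.
async function extractWithTiers(emailText, tiers) {
  for (const tier of tiers) {
    if (!tier.enabled) continue;
    try {
      const code = await tier.extract(emailText);
      if (code) return { code, tier: tier.name };
    } catch (error) {
      // A failing tier (for example, an unavailable model) falls through to the next one.
    }
  }
  return null;
}

// Example wiring with placeholder extractors for the AI tiers.
const exampleTiers = [
  { name: "regex", enabled: true, extract: extractCodeWithRegex },
  { name: "gemini-nano", enabled: true, extract: async (text) => null }, // placeholder
  { name: "gemini-api", enabled: false, extract: async (text) => null }, // off by default
];
```

Ordering the tiers from cheapest to most expensive means most standard emails never reach the AI tiers at all.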

Proactive Autofill & Display:

The system has two outputs. The Popup displays the code for one-click copying. A second, globally injected Content Script (otp-intent-listener.js) detects when a user focuses on an OTP input field on any website, signaling to the background service to prepare for an incoming code. This enables a future of seamless, zero-touch autofill.
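
One way the OTP-field detection could work is sketched below. The helper isOtpField, the name/id pattern, and the message shape in the commented wiring are assumptions; autocomplete="one-time-code" is the standard HTML hint for such fields.

```javascript
// Assumed fallback pattern for fields that do not set the autocomplete hint.
const OTP_NAME_HINT = /otp|one[-_]?time|verif|passcode|2fa/i;

// Decides whether an input element looks like an OTP field.
function isOtpField(input) {
  const autocomplete = (input.getAttribute("autocomplete") || "").toLowerCase();
  if (autocomplete.split(/\s+/).includes("one-time-code")) return true;
  const label = `${input.getAttribute("name") || ""} ${input.getAttribute("id") || ""}`;
  return OTP_NAME_HINT.test(label);
}

// In a content script, this could be wired up roughly like so (not executed here):
// document.addEventListener("focusin", (event) => {
//   if (event.target instanceof HTMLInputElement && isOtpField(event.target)) {
//     chrome.runtime.sendMessage({ type: "OTP_FIELD_FOCUSED" }); // hypothetical message shape
//   }
// });
```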


⚠️ Challenges

Reliably Detecting Emails in Gmail's Dynamic UI.

  • I implemented a hybrid system combining a MutationObserver with a periodic setInterval polling fallback, so that new emails are still picked up even when Gmail's asynchronous rendering produces changes the observer alone would miss.
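
A rough sketch of the hybrid approach, assuming both triggers share one scan and a set of already-reported IDs. The names createNewEmailTracker, watchInbox, and scanInbox, and the 5-second interval, are hypothetical.

```javascript
// Remembers which email IDs were already reported, so the observer and the
// polling timer can both trigger scans without producing duplicates.
function createNewEmailTracker(onNewEmail) {
  const seen = new Set();
  return function reportEmails(emailIds) {
    for (const id of emailIds) {
      if (seen.has(id)) continue;
      seen.add(id);
      onNewEmail(id);
    }
  };
}

// Browser-only wiring: the observer reacts quickly, the interval is a safety net.
// `scanInbox` is a hypothetical function returning IDs of unread rows in the DOM.
function watchInbox(root, scanInbox, onNewEmail, pollMs = 5000) {
  const report = createNewEmailTracker(onNewEmail);
  const scan = () => report(scanInbox(root));
  const observer = new MutationObserver(scan);
  observer.observe(root, { childList: true, subtree: true });
  const timer = setInterval(scan, pollMs);
  scan(); // pick up emails that were already present
  return () => {
    observer.disconnect();
    clearInterval(timer);
  };
}
```

Because deduplication lives in the tracker, it does not matter which trigger notices an email first.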

AI "Hallucinations" (Extracting the Wrong Numbers).

  • I developed a heuristic post-processing layer in the service worker. After the AI extracts a code, this layer validates it against OTP length constraints and contextual keywords (e.g., ignoring "invoice numbers" but confirming "verification codes"), which reduces false positives.
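
Such a validation layer might look like the sketch below. The 4-8 digit range, the keyword lists, and the 40-character look-behind window are assumptions for illustration, and only the first occurrence of the code in the email is checked.

```javascript
// Assumed keyword lists; matching is done on lowercased text.
const POSITIVE_HINTS = ["verification", "otp", "one-time", "passcode", "security code", "login code"];
const NEGATIVE_HINTS = ["invoice", "order", "tracking", "receipt"];

// Accepts a candidate only if it has a plausible length, actually appears in
// the email, and the closest keyword before it is a positive one.
function validateCandidate(code, emailText, windowSize = 40) {
  if (!/^\d{4,8}$/.test(code)) return false;
  const index = emailText.indexOf(code);
  if (index === -1) return false; // the model returned something not in the email
  const before = emailText.slice(Math.max(0, index - windowSize), index).toLowerCase();
  const lastPositive = Math.max(...POSITIVE_HINTS.map((word) => before.lastIndexOf(word)));
  const lastNegative = Math.max(...NEGATIVE_HINTS.map((word) => before.lastIndexOf(word)));
  return lastPositive !== -1 && lastPositive > lastNegative;
}
```

Checking that the code literally appears in the email text is a simple guard against a model inventing digits.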

🧠 What I Learned

I truly enjoyed crafting an idea into a complete, polished product. This journey gave me a holistic feel for creating end-to-end value, from architectural decisions to the smallest interaction details, and it solidified my trust in AI. Watching it extract the right answer from messy, complex emails, again and again, reinforced my belief that AI is a powerful and dependable partner for solving everyday frictions.


🚀 Future Improvements

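  • Zero-Touch Autofill: Build on the otp-intent-listener.js signal to fill the detected code into the focused OTP field automatically.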
  • True Background Operation (OAuth2): The next major leap is integrating the Gmail API via secure OAuth2. This will allow CodePal to run invisibly in the background, without requiring an open Gmail tab.
  • Multimodal AI for Image OTPs: Leverage Gemini's multimodal capabilities to extract verification codes from images and screenshots within emails.
