Inspiration

We've all done it - clicked "I agree" on a 50-page Terms of Service without reading a single word. Whether it's Netflix's terms, a new app's privacy policy, or a rental agreement, legal documents are overwhelming. The average person encounters dozens of these documents every year and reads almost none of them.

The problem isn't just laziness - it's accessibility. Legal jargon, dense formatting, and sheer length make these documents intentionally difficult to parse. Meanwhile, buried in those pages could be clauses that:

  • Auto-renew subscriptions at higher rates
  • Allow unlimited data collection
  • Waive your right to sue
  • Grant platforms the right to terminate your account without notice

We asked ourselves: What if you could understand any legal document in seconds, without leaving the page you're on?


What It Does

VoiceLegal AI is a three-part solution that makes legal documents accessible to everyone:

1. Chrome Extension (The Game Changer)

  • Detects legal documents automatically (terms, privacy policies, contracts)
  • Floating button appears on the page
  • One click → instant analysis, no tab switching, no PDF downloads
  • Works on any website: Netflix, Google, social media platforms, SaaS products

2. AI-Powered Analysis

Powered by Google's Gemini 2.5 Flash via Vertex AI, our system provides:

Executive Summary: 2-3 sentence overview of what you're agreeing to

Risk Assessment:

  • 🔴 HIGH RISK - Mandatory arbitration, unlimited liability waivers
  • 🟡 MEDIUM RISK - Unilateral modification rights, broad data usage
  • 🟢 LOW RISK - Standard administrative terms

Key Terms: Bullet-point breakdown of main provisions

Consumer Warnings: What to watch out for (auto-renewals, cancellation fees, data retention)

Hidden Clauses: Easy-to-miss terms that could hurt you later

3. Voice Assistant (Powered by ElevenLabs)

Here's what sets us apart from ChatGPT or Gemini:

  • Conversational AI that already knows your document
  • Ask follow-up questions naturally: "What happens if I cancel early?"
  • Clarify confusing clauses: "Explain this arbitration clause in simple terms"
  • Real-time voice responses - no typing needed
  • Context-aware - references the specific document you uploaded

How to Test VoiceLegal AI

⚠️ Important: Wake Up the Backend First

We're using free hosting to keep this project accessible:

  • Frontend: Vercel (always active)
  • Backend: Render (free tier with auto-sleep)

Render's free tier shuts down the server after 15 minutes of inactivity. Please follow these steps:

Step 1: Wake Up the Backend Server

  1. Visit: https://voicelegal-ai-web-extension.onrender.com
  2. You'll see: {"detail":"Not Found"} ← This is expected! It means the server is now awake ✅
  3. Wait ~30 seconds for full initialization

Step 2: Use the Web Dashboard

  1. Visit: https://voicelegal-ai-web-extension.vercel.app
  2. Upload any PDF legal document
  3. Get instant AI analysis
  4. Click "Start Voice Chat" to talk with the AI assistant

📥 Installing the Chrome Extension

Since we're unable to pay the Chrome Web Store developer fee, the extension must be installed manually as an unpacked extension.

Installation Steps:

1. Download the Extension

# Clone the repository
git clone https://github.com/varunsonawane/voicelegal-ai-web-extension.git

# Navigate to extension folder
cd voicelegal-ai-web-extension/extension

Or download the ZIP from GitHub and extract it.

2. Load Extension in Chrome

  1. Open Chrome and go to: chrome://extensions/

  2. Enable Developer Mode

    • Toggle the switch in the top-right corner
  3. Load Unpacked Extension

    • Click "Load unpacked" button
    • Navigate to the voicelegal-ai-web-extension/extension folder
    • Select the folder and click "Select Folder"
  4. Verify Installation

    • You should see "VoiceLegal AI" in your extensions list
    • The extension icon should appear in your toolbar

3. Test the Extension

  1. Visit any legal document page, such as a Terms of Service or privacy policy page

  2. Look for the floating button on the bottom-right

    • It appears automatically on detected legal pages
    • Shows a 🔍 icon
  3. Click "Analyze This Page"

    • Analysis appears in sidebar within 3-5 seconds
    • Scroll through the risk assessment
  4. Click "Open Voice Assistant"

    • Opens dashboard in new tab with analysis pre-loaded
    • Click "Start Voice Chat" to begin conversation
    • Grant microphone permission when prompted
    • Ask questions like: "What are the high-risk clauses?"

📝 Note for Judges

If the backend is sleeping when you test:

  1. Simply visit the backend URL once to wake it
  2. Wait 30 seconds
  3. All features will work normally

This is a limitation of free hosting - in production, we'd use a paid tier with zero downtime.


Why VoiceLegal AI Beats ChatGPT, Gemini & Other AI Platforms

The Current Problem with Existing AI Solutions:

Sure, you could use ChatGPT or Gemini to analyze legal documents. But here's what you'd have to do:

With ChatGPT/Gemini:

  1. ❌ Find the document link or download the PDF
  2. ❌ Open a new tab to ChatGPT/Gemini
  3. ❌ Upload the PDF or copy-paste the entire text
  4. ❌ Type out your questions manually
  5. ❌ Switch back and forth between tabs
  6. ❌ Re-upload or re-paste for every new document
  7. ❌ No context persistence between sessions

With VoiceLegal AI:

  1. Automatic detection - extension recognizes legal pages instantly
  2. Zero downloads - works directly on any webpage
  3. One click - floating button appears, no tab switching
  4. Pre-loaded context - voice assistant already knows your document
  5. Hands-free - just speak your questions naturally
  6. Seamless flow - analyze → ask → get answers in seconds
  7. No copy-pasting - works on live web pages, not just PDFs

Real-World Example:

Scenario: You're signing up for a new SaaS product and want to understand their terms.

Traditional AI Approach (ChatGPT/Gemini):

1. Find "Terms of Service" link → 30 seconds
2. Open in new tab → 5 seconds
3. Download PDF or copy text → 45 seconds
4. Open ChatGPT → 10 seconds
5. Upload/paste content → 30 seconds
6. Type question → 20 seconds
7. Read response → 1 minute
8. Switch tabs to compare → 10 seconds

Total time: ~3.5 minutes (and you still haven't signed up)

VoiceLegal AI Approach:

1. Extension detects page → automatic
2. Click floating button → 1 second
3. Read analysis → 30 seconds
4. Click "Voice Assistant" → 1 second
5. Ask question by voice → 5 seconds
6. Get instant answer → 10 seconds

Total time: ~47 seconds (about 4.5x faster than the 210 seconds of the traditional approach)
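
As a quick check on the comparison, the step durations listed above can be summed directly. The snippet is only illustrative arithmetic: the durations are the estimates from the two lists, and counting the automatic detection step as 0 seconds is our assumption.

```javascript
// Step durations in seconds, copied from the two lists above.
const traditionalSteps = [30, 5, 45, 10, 30, 20, 60, 10];
// Automatic detection is counted as 0 seconds (assumption).
const voiceLegalSteps = [0, 1, 30, 1, 5, 10];

const sum = (values) => values.reduce((total, v) => total + v, 0);

const traditionalTotal = sum(traditionalSteps); // 210
const voiceLegalTotal = sum(voiceLegalSteps);   // 47
const speedup = traditionalTotal / voiceLegalTotal;

console.log(`Traditional approach: ${traditionalTotal} s`);
console.log(`VoiceLegal AI:        ${voiceLegalTotal} s`);
console.log(`Speedup:              ${speedup.toFixed(1)}x`); // prints 4.5x
```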

The Killer Features:

🚀 Zero Context Switching - Everything happens right where you are
🎯 Automatic Page Detection - No manual work required
🎤 Voice-First Interface - Speak naturally, no typing needed
Instant Analysis - Results in 3-5 seconds, not minutes
🔒 Privacy-Focused - No data stored permanently
🌐 Universal Compatibility - Works on ANY website with legal text

Bottom Line: VoiceLegal AI doesn't just analyze documents better - it removes every point of friction that makes people skip reading terms in the first place.


How We Built It

Architecture

Backend (FastAPI + Google Cloud Run)

# Core Analysis Pipeline
1. PDF extraction via PyPDF2
2. Text preprocessing (15,000 char limit)
3. Gemini 2.5 Flash analysis with structured prompt
4. Risk categorization algorithm
5. RESTful API endpoints for extension + dashboard
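
Steps 2 and 4 of the pipeline above could look roughly like the sketch below. It is written in JavaScript to match the extension code elsewhere in this writeup, whereas the real backend is Python and relies on Gemini for the actual analysis. The function names (`truncateForModel`, `categorizeRisk`) and the keyword patterns are hypothetical, loosely based on the risk examples listed earlier; only the 15,000-character limit comes from the pipeline itself.

```javascript
const MAX_CHARS = 15000; // limit stated in the pipeline above

// Collapse whitespace, then cut at the last space at or before the limit
// so the model never receives a word chopped in half.
function truncateForModel(text, maxChars = MAX_CHARS) {
  const normalized = text.replace(/\s+/g, " ").trim();
  if (normalized.length <= maxChars) return normalized;
  const cut = normalized.lastIndexOf(" ", maxChars);
  return normalized.slice(0, cut > 0 ? cut : maxChars);
}

// Hypothetical keyword patterns, checked from most to least severe.
const RISK_PATTERNS = [
  { level: "HIGH", pattern: /\b(arbitration|waive[sd]?|liability)\b/i },
  { level: "MEDIUM", pattern: /\b(modify|amend|sole discretion|third part(?:y|ies))\b/i },
];

// Return the first matching risk level, defaulting to LOW.
function categorizeRisk(clause) {
  const match = RISK_PATTERNS.find(({ pattern }) => pattern.test(clause));
  return match ? match.level : "LOW";
}

console.log(categorizeRisk("Disputes are resolved by binding arbitration."));
console.log(categorizeRisk("We may modify these terms at any time."));
console.log(categorizeRisk("Notices will be sent by email."));
```

A keyword pass like this is far cruder than an LLM-based assessment; it is shown only to make the HIGH/MEDIUM/LOW bucketing concrete.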

Frontend (React + Vite + Tailwind CSS)

  • Document upload interface with drag-and-drop
  • Markdown rendering for analysis results
  • ElevenLabs React SDK integration for voice
  • Responsive design for mobile/desktop

Chrome Extension (Manifest V3)

// Extension Flow
1. Content script detects legal pages (regex pattern matching)
2. Extracts page text via DOM traversal
3. Background script proxies API calls (bypass CORS)
4. Injects sidebar with analysis
5. "Voice Assistant" button → opens dashboard with pre-loaded context

Key Technical Decisions

Challenge: Content scripts can't send requests to localhost (ERR_BLOCKED_BY_CLIENT)
Solution: Background script as API proxy with message passing

Challenge: localStorage is scoped per origin (the extension and the dashboard can't share data through it)
Solution: Backend temporary storage with UUID-based retrieval, 5-minute expiry

Challenge: ElevenLabs webhook needs public URL (localhost not accessible)
Solution: Deployed backend to Google Cloud Run, webhook calls production API

Challenge: Service workers are terminated after about 30 seconds of inactivity
Solution: Keepalive heartbeat every 20 seconds, retry logic with exponential backoff

Challenge: Voice assistant needs document context
Solution: Store analysis in backend, pass conversation ID to ElevenLabs webhook tool


Technologies Used

AI/ML

  • Google Vertex AI (Gemini 2.5 Flash) - Document analysis
  • ElevenLabs Conversational AI - Voice agent with custom tools

Backend

  • FastAPI - RESTful API framework
  • PyPDF2 - PDF text extraction
  • Google Cloud Run - Serverless deployment
  • Render - Production backend hosting
  • Python 3.11

Frontend

  • React 18 - UI framework
  • Vite - Build tool
  • Tailwind CSS - Styling
  • Axios - HTTP client
  • @elevenlabs/react - Voice SDK

Chrome Extension

  • Manifest V3 - Extension framework
  • Content Scripts - Page interaction
  • Background Service Worker - API proxy
  • Message Passing - Cross-context communication

DevOps

  • Docker - Containerization
  • Google Cloud Build - CI/CD
  • Vercel - Frontend hosting
  • Render - Backend hosting

Challenges We Faced

1. Chrome Extension CORS Nightmares

Problem: Content scripts blocked from fetching localhost due to Chrome security policies.

Attempts:

  • ❌ Direct fetch from content script → ERR_BLOCKED_BY_CLIENT
  • ❌ Inline script injection → CSP violations
  • ❌ localStorage data transfer → Domain isolation

Solution: Background script architecture with chrome.runtime.sendMessage() for API proxy.

Learning: Chrome's security model is complex but well-designed. Working with it (not against it) leads to better architecture.
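
A minimal sketch of the background-script proxy described above is shown here. The base URL is the deployed backend named in this writeup, but the `/analyze` path, the `ANALYZE_TEXT` message type, and the helper name `makeProxyListener` are assumptions for illustration, not the project's actual identifiers. The fetch function is injected so the handler can be exercised outside the browser.

```javascript
const API_BASE = "https://voicelegal-ai-web-extension.onrender.com";

// Builds a listener suitable for chrome.runtime.onMessage.addListener.
function makeProxyListener(fetchImpl, apiBase = API_BASE) {
  return (message, _sender, sendResponse) => {
    // Ignore messages meant for other handlers.
    if (!message || message.type !== "ANALYZE_TEXT") return false;

    fetchImpl(`${apiBase}/analyze`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text: message.text }),
    })
      .then((res) => {
        if (!res.ok) throw new Error(`Backend responded with ${res.status}`);
        return res.json();
      })
      .then((data) => sendResponse({ ok: true, data }))
      .catch((error) => sendResponse({ ok: false, error: String(error) }));

    // Returning true keeps the message channel open until sendResponse
    // is called asynchronously.
    return true;
  };
}

// Background service worker wiring (skipped when not running in an extension).
if (typeof chrome !== "undefined" && chrome.runtime?.onMessage) {
  chrome.runtime.onMessage.addListener(makeProxyListener(fetch));
}

// Content script side, roughly:
//   const reply = await chrome.runtime.sendMessage({ type: "ANALYZE_TEXT", text: pageText });
```

Because the network request is made from the background context, the content script only ever talks to the extension itself.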


2. ElevenLabs Webhook Integration

Problem: Voice agent couldn't access document context. Localhost webhooks unreachable from ElevenLabs servers.

Journey:

Attempt 1: Send context in first message → Hit 500-char limit
Attempt 2: Store in localStorage → Not accessible across domains  
Attempt 3: Use sessionStorage → Same issue
Attempt 4: Backend webhook with localhost → 404 from ElevenLabs
Solution: Cloud Run deployment + webhook tool configuration

Learning: Always plan for production architecture early. Localhost-only solutions create technical debt.


3. Service Worker Lifecycle Management

Problem: Background service workers are terminated after about 30 seconds of inactivity, breaking API calls.

Symptoms:

  • "Could not establish connection. Receiving end does not exist"
  • Intermittent failures on button clicks
  • Lost message handlers

Solution:

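// Keepalive heartbeat: calling an extension API resets the service
// worker's idle timer (a bare console.log inside setInterval does not).
setInterval(() => {
  chrome.runtime.getPlatformInfo(() => {});
}, 20000);

// Retry with exponential backoff: wait 1 s, then 2 s, between attempts
const sendWithRetry = async (message, retries = 3) => {
  for (let i = 0; i < retries; i++) {
    try {
      return await chrome.runtime.sendMessage(message);
    } catch (error) {
      // Out of attempts: surface the last error to the caller
      if (i === retries - 1) throw error;
      await new Promise(r => setTimeout(r, 1000 * 2 ** i));
    }
  }
};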

Learning: Manifest V3 service workers require active lifecycle management. Design for failure scenarios.


4. Real-time Data Synchronization

Problem: Extension analyzes document → user clicks "Voice Assistant" → dashboard needs same data.

Requirements:

  • Cross-domain transfer
  • No permanent storage
  • One-time use (security)
  • 5-minute expiry

Architecture:

Extension → Background Script → Backend (UUID storage)
                                      ↓
                                   Dashboard fetches by UUID
                                      ↓
                                   Data deleted after retrieval

Learning: In-memory storage with TTL is perfect for ephemeral data transfer. Simple and secure.
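
The one-time, expiring hand-off described above can be sketched as a small in-memory store. This is an illustration in JavaScript rather than the project's Python backend code; the class name `EphemeralStore` and its methods are hypothetical, and the clock and ID generator are injectable purely to make the behavior easy to test.

```javascript
const FIVE_MINUTES_MS = 5 * 60 * 1000;

class EphemeralStore {
  constructor({
    ttlMs = FIVE_MINUTES_MS,
    now = () => Date.now(),
    newId = () => globalThis.crypto.randomUUID(),
  } = {}) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.newId = newId;
    this.entries = new Map();
  }

  // Store a payload and return the ID the dashboard will fetch it by.
  put(payload) {
    const id = this.newId();
    this.entries.set(id, { payload, expiresAt: this.now() + this.ttlMs });
    return id;
  }

  // Return the payload at most once; unknown or expired IDs yield null.
  take(id) {
    const entry = this.entries.get(id);
    if (!entry) return null;
    this.entries.delete(id); // one-time use
    return this.now() < entry.expiresAt ? entry.payload : null;
  }

  // Drop expired entries that were never retrieved.
  sweep() {
    const t = this.now();
    for (const [id, entry] of this.entries) {
      if (t >= entry.expiresAt) this.entries.delete(id);
    }
  }
}
```

In this sketch the extension side would call `put(analysis)` and pass the returned ID to the dashboard, which calls `take(id)` once. A periodic `sweep()` keeps abandoned entries from accumulating.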


What We Learned

Technical Skills:

  • Chrome Extension Manifest V3 architecture
  • Google Cloud Run deployment with Docker
  • Vertex AI Gemini API integration
  • ElevenLabs Conversational AI webhooks
  • Cross-origin communication patterns
  • Service worker lifecycle management

Product Development:

  • User experience is everything - the extension removes all friction
  • Voice interfaces require careful context management
  • MVP should work end-to-end, then optimize
  • Production deployment reveals issues that localhost testing misses

AI Integration:

  • Structured prompts dramatically improve output quality
  • Gemini 2.5 Flash is fast enough for real-time analysis (<3 seconds)
  • Voice AI is surprisingly accessible with modern SDKs
  • Context window management is critical for long documents

What's Next

Short-term:

  • Multi-language support (Spanish, French, German)
  • Firefox extension port
  • Save analysis history
  • Share analysis links with friends
  • Browser-native notifications for risky clauses

Long-term:

  • Mobile app (iOS/Android)
  • Enterprise version for contract review
  • Integration with DocuSign, HelloSign
  • Comparison mode (compare two documents)
  • Legal expert review marketplace

Impact

VoiceLegal AI democratizes access to legal understanding. Whether you're:

  • A student signing up for software trials
  • A parent reviewing app privacy policies
  • A freelancer negotiating contracts
  • Anyone accepting online terms

You deserve to know what you're agreeing to - without needing a law degree.

Our extension makes this possible with zero effort. No downloads, no copy-pasting, no leaving the page. Just click, read, and ask questions.

Because legal literacy should be a right, not a privilege.


Try it now and stop accepting terms blindly! 🚀
