VoxGuard

Real-time phone call deepfake and scam detection that shields users by intercepting threats before they reach the conversation.

How It Works

Caller dials your shield number
        │
        ▼
┌───────────────┐     ┌──────────────────┐     ┌───────────────┐
│    Telnyx     │────▶│  VoxGuard Server  │────▶│   Your Phone  │
│  (Inbound)    │     │                   │     │  (Forwarded)  │
└───────────────┘     └──────────────────┘     └───────────────┘
                             │    │
              ┌──────────────┘    └──────────────┐
              ▼                                  ▼
     ┌─────────────────┐              ┌──────────────────┐
     │ Deepfake Detect  │              │   Scam Detect    │
     │ (DF Arena + LoRA)│              │ (Vertex AI       │
     │ 4s sliding window│              │  Gemini 2.5 Pro) │
     │ every 0.5s       │              │  15s audio chunks│
     └─────────────────┘              └──────────────────┘
              │                                  │
              └──────────┐    ┌──────────────────┘
                         ▼    ▼
                   Threat detected?
                    │           │
                   No          Yes
                    │           │
                    ▼           ▼
              Call continues  ┌─────────────────────┐
              normally        │     ISOLATION        │
                              │ ─ Unbridge caller    │
                              │ ─ Decoy AI engages   │
                              │   attacker           │
                              │ ─ User notified via  │
                              │   AI agent           │
                              └─────────────────────┘

Every call is routed through a shield number assigned to each user. VoxGuard bridges the caller and the user and streams the live audio to both detectors. If the deepfake detector (3 consecutive chunks above threshold) or the scam detector (confidence >= 0.7) fires, the call is isolated: the attacker is unbridged and handed to an ElevenLabs decoy agent, while a separate agent notifies the user.
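
The two isolation conditions can be sketched as a small per-call state machine. This is a minimal illustration, not the project's code: the class and method names are hypothetical, "above threshold" is read as strictly greater than `FAKE_THRESHOLD`, and the streak is assumed to reset on any score that falls below it.

```python
from dataclasses import dataclass

FAKE_THRESHOLD = 0.8        # default spoof-score threshold (see FAKE_THRESHOLD)
CONSECUTIVE_REQUIRED = 3    # consecutive detections needed for isolation
SCAM_CONFIDENCE_MIN = 0.7   # scam confidence that triggers isolation


@dataclass
class IsolationTrigger:
    """Hypothetical helper that tracks detector outputs for one call."""

    streak: int = 0
    isolated: bool = False

    def on_deepfake_score(self, score: float) -> bool:
        # Assumption: a single score at or below the threshold resets the streak.
        self.streak = self.streak + 1 if score > FAKE_THRESHOLD else 0
        if self.streak >= CONSECUTIVE_REQUIRED:
            self.isolated = True
        return self.isolated

    def on_scam_confidence(self, confidence: float) -> bool:
        # A single scam result at or above 0.7 is enough to isolate.
        if confidence >= SCAM_CONFIDENCE_MIN:
            self.isolated = True
        return self.isolated
```

Either detector can isolate the call on its own; once `isolated` is set it stays set for the rest of the call.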

Architecture

Layer	Technology
Backend	FastAPI, Uvicorn, async Python 3.11
Frontend	Next.js 16 (static export), React 19, TypeScript, Tailwind CSS 4
Database	PostgreSQL 16, SQLAlchemy 2.0 (async), Alembic migrations
Deepfake Detection	DF Arena 1B (LoRA fine-tuned), custom inference API
Scam Detection	Vertex AI fine-tuned Gemini 2.5 Pro with function calling
Telephony	Telnyx Voice API (webhooks + WebSocket audio streaming)
AI Agents	ElevenLabs Conversational AI (decoy + user notification)
Billing	Stripe (subscriptions, checkout, customer portal)
Email / SMS	Brevo (transactional email, OTP verification)
Geolocation	Numvalidate (phone lookup) + Google Maps Geocoding
Deployment	Docker (multi-stage), Docker Compose, Railway

Features

Real-time deepfake voice detection -- 4-second sliding window analysis every 0.5 seconds, isolation on 3 consecutive high-confidence detections
Real-time scam detection -- 15-second audio chunks analyzed by a fine-tuned Gemini model across 10 scam categories (IRS, tech support, romance, bank fraud, etc.)
Automatic call isolation -- attacker unbridged and redirected to a decoy AI agent that keeps them engaged
Per-user shield numbers -- each user gets a dedicated Telnyx number; callers dial the shield number and are transparently forwarded
Live dashboard -- SSE-powered real-time call monitoring with status badges, confidence meters, and call timelines
Threat map -- Google Maps heatmap of caller origins weighted by threat type
Call history -- paginated records with full audio playback (attacker and user tracks), geolocation, and detection details
Per-user detection toggles -- independently enable/disable deepfake and scam detection
Subscription billing -- free tier (10 calls/month) and Pro plan via Stripe with 30-day trial
Phone verification -- SMS OTP via Brevo with rate limiting
Caller geolocation -- phone number lookup via Numvalidate + Google Maps geocoding
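
The sliding-window schedule from the first feature (4-second windows, one every 0.5 seconds) can be sketched as follows. The function name is hypothetical and the 8 kHz sample rate is an assumption typical of telephony audio; the actual rate used by the backend is not stated here.

```python
from typing import Iterator, Tuple

SAMPLE_RATE = 8000      # assumption: 8 kHz telephony audio
WINDOW_SECONDS = 4.0    # window length from the feature list
STRIDE_SECONDS = 0.5    # a new window starts every 0.5 s


def window_spans(n_samples: int, sample_rate: int = SAMPLE_RATE) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) sample indices of every full 4 s window."""
    window = int(WINDOW_SECONDS * sample_rate)
    stride = int(STRIDE_SECONDS * sample_rate)
    start = 0
    # Only full windows are produced, so nothing is scored before 4 s of audio exist.
    while start + window <= n_samples:
        yield start, start + window
        start += stride
```

With these numbers, consecutive windows overlap by 3.5 seconds, so three consecutive detections span 5 seconds of audio in total.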

Project Structure

voxguard/
├── backend/
│   ├── app.py                 # Main FastAPI app (webhooks, WebSocket, API routes)
│   ├── auth.py                # JWT auth, registration, login
│   ├── models.py              # SQLAlchemy models (User, CallRecord, NumberPool)
│   ├── database.py            # Async PostgreSQL connection
│   ├── detector.py            # Deepfake detection API client
│   ├── scam_detector.py       # Vertex AI scam classification
│   ├── agents.py              # ElevenLabs agent bridge (WebSocket ↔ audio)
│   ├── audio_utils.py         # μ-law ↔ PCM16 ↔ WAV conversion
│   ├── phone_lookup.py        # Numvalidate + Google Maps geolocation
│   ├── telnyx_numbers.py      # Shield number provisioning & pool management
│   ├── email_service.py       # Brevo transactional email
│   ├── verification.py        # SMS OTP verification
│   ├── tts.py                 # ElevenLabs TTS utility
│   ├── requirements.txt
│   └── migrations/            # Alembic schema migrations (7 versions)
├── frontend/
│   ├── src/app/
│   │   ├── page.tsx           # Landing page (hero, pricing, how it works)
│   │   ├── dashboard/         # Real-time call monitoring dashboard
│   │   ├── account/           # User settings & subscription management
│   │   ├── login/             # Authentication
│   │   ├── register/          # Registration + phone OTP flow
│   │   └── components/        # 19 React components
│   ├── package.json
│   └── next.config.ts         # Static export configuration
├── deepfake-detection/
│   ├── prepare_data_1k.py     # Data pipeline (LibriSpeech + Whisper + Replicate TTS)
│   ├── train_lora_1k_aug.py   # LoRA fine-tuning with phone-call augmentation
│   ├── augment_phone.py       # Audio degradation (G.711, GSM, noise, reverb, packet loss)
│   ├── eval_held_out.py       # Held-out evaluation
│   └── README.md              # ML pipeline documentation
├── scam-detection/
│   ├── generate_scam_calls.py # Generate 100 scam conversations via ElevenLabs
│   └── prepare_vertex_finetune.py  # Prepare JSONL + launch Vertex AI fine-tuning
├── Dockerfile                 # Multi-stage build (Node.js frontend + Python backend)
├── docker-compose.yml         # App + PostgreSQL services
└── .env.example               # Environment variable template
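
For readers unfamiliar with the μ-law ↔ PCM16 ↔ WAV step listed under `audio_utils.py`, here is a minimal standard-library sketch of the decode direction. It follows the usual G.711 μ-law expansion and is not taken from the repository; the function names and the 8 kHz mono default are assumptions.

```python
import io
import struct
import wave


def ulaw_byte_to_pcm16(b: int) -> int:
    """Expand one G.711 mu-law byte to a signed 16-bit PCM sample."""
    u = ~b & 0xFF                    # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


def ulaw_to_wav(ulaw: bytes, sample_rate: int = 8000) -> bytes:
    """Decode mu-law audio and wrap it in a mono 16-bit WAV container."""
    pcm = struct.pack(f"<{len(ulaw)}h", *(ulaw_byte_to_pcm16(b) for b in ulaw))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)          # assumption: one track per WAV file
        wav.setsampwidth(2)          # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

A pure-Python loop like this is fine for illustration; a production path would more likely use a vectorized lookup table.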

Setup

Prerequisites

Python 3.11+
Node.js 20+
PostgreSQL 16+
Docker & Docker Compose (for containerized setup)

Environment Variables

Copy the example and fill in your keys:

cp .env.example .env

Key variables:

Variable	Description
`TELNYX_API_KEY`	Telnyx Voice API key
`TELNYX_CONNECTION_ID`	Telnyx SIP connection ID
`PUBLIC_WSS_URL`	Public WebSocket URL for audio streaming (e.g., `wss://your-domain/telnyx/ws`)
`ELEVEN_API_KEY`	ElevenLabs API key
`ELEVEN_SCAMMER_AGENT_ID`	ElevenLabs decoy agent ID
`ELEVEN_USER_AGENT_ID`	ElevenLabs user notification agent ID (deepfake)
`ELEVEN_SCAM_USER_AGENT_ID`	ElevenLabs user notification agent ID (scam)
`DETECTOR_API_URL`	Deepfake model inference endpoint
`FAKE_THRESHOLD`	Spoof score threshold (default: `0.8`)
`DATABASE_URL`	PostgreSQL connection string
`JWT_SECRET`	Secret for JWT signing
`GOOGLE_SERVICE_ACCOUNT_JSON`	GCP service account JSON (for Vertex AI scam detection)
`BREVO_API_KEY`	Brevo API key (email + SMS)
`STRIPE_SECRET_KEY`	Stripe secret key
`STRIPE_PRICE_ID`	Stripe subscription price ID
`WEBHOOK_SECRET`	Stripe webhook signing secret
`NUMVALIDATE_API_KEY`	Phone number lookup API key
`GOOGLE_MAPS_API_KEY`	Google Maps Geocoding API key
`SITE_URL`	Frontend URL (default: `https://voxguard.org`)
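
As an illustration of how these variables and their documented defaults might be read at startup, here is a small sketch. The `Settings` class, `load_settings` function, and the choice of which variables are treated as required are all hypothetical; only the variable names and the two defaults come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Illustrative subset; the real backend may require more (or fewer) variables.
REQUIRED = ("TELNYX_API_KEY", "DATABASE_URL", "JWT_SECRET")


@dataclass(frozen=True)
class Settings:
    telnyx_api_key: str
    database_url: str
    jwt_secret: str
    fake_threshold: float
    site_url: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        # Fail fast with a readable message instead of a KeyError later on.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        telnyx_api_key=env["TELNYX_API_KEY"],
        database_url=env["DATABASE_URL"],
        jwt_secret=env["JWT_SECRET"],
        fake_threshold=float(env.get("FAKE_THRESHOLD", "0.8")),   # documented default
        site_url=env.get("SITE_URL", "https://voxguard.org"),      # documented default
    )
```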

Docker (recommended)

docker compose up --build

This starts the FastAPI backend (with the frontend static export bundled in) on port 8000 and PostgreSQL on port 5432.

Local Development

Backend:

cd backend
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
alembic upgrade head
uvicorn app:app --host 0.0.0.0 --port 8000 --reload

Frontend:

cd frontend
npm install
npm run dev

The frontend dev server runs on http://localhost:3000 with Turbopack. For production, the frontend is statically exported (next build) and served by the backend.

API Overview

Authentication

Method	Endpoint	Description
POST	`/api/auth/register`	Register (email, password, phone)
POST	`/api/auth/login`	Login
POST	`/api/auth/logout`	Logout
GET	`/api/auth/me`	Current user profile
POST	`/api/auth/verify-phone`	Verify phone OTP
POST	`/api/auth/resend-code`	Resend OTP (60s cooldown)
POST	`/api/auth/retry-provision`	Retry shield number provisioning
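
The 60-second cooldown on `/api/auth/resend-code` could be enforced along these lines. This is an in-memory sketch with hypothetical names; it says nothing about how the backend actually stores the last send time, and a multi-process deployment would need shared storage instead.

```python
import time
from typing import Callable, Dict

RESEND_COOLDOWN_SECONDS = 60


class ResendLimiter:
    """Hypothetical per-phone cooldown tracker (in-memory only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._last_sent: Dict[str, float] = {}

    def seconds_until_allowed(self, phone: str) -> float:
        last = self._last_sent.get(phone)
        if last is None:
            return 0.0
        return max(0.0, RESEND_COOLDOWN_SECONDS - (self._clock() - last))

    def try_send(self, phone: str) -> bool:
        # Returns False while the cooldown is active; the caller would
        # typically answer with HTTP 429 in that case.
        if self.seconds_until_allowed(phone) > 0:
            return False
        self._last_sent[phone] = self._clock()
        return True
```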

Calls & Dashboard

Method	Endpoint	Description
GET	`/events`	SSE stream (live call updates)
GET	`/api/calls`	Paginated call history
GET	`/api/calls/{id}`	Call detail with timeline
GET	`/api/calls/{id}/audio/{track}`	Stream call audio (`attacker` or `user`)
GET	`/api/stats`	Dashboard statistics
GET	`/api/map-points`	Threat map geolocation data
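
A client consuming the `/events` stream has to parse the `text/event-stream` wire format. The sketch below handles a simplified subset of that format (`event:` and `data:` fields, comment lines, blank-line dispatch). The event name and JSON payload in the usage note are made up for illustration; the real payload shape is not documented here.

```python
from typing import Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Turn raw SSE lines into {"event": ..., "data": ...} dictionaries."""
    event = "message"                 # default event type when none is given
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue                  # comment line, often used as keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]     # a single leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
```

For example, feeding it the lines `event: call_update`, `data: {"status": "isolated"}` and an empty line yields one event whose `data` string can then be passed to `json.loads`.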

User Management

Method	Endpoint	Description
PATCH	`/api/user/phone`	Update phone number
PATCH	`/api/user/detection-settings`	Toggle deepfake/scam detection
DELETE	`/api/user/account`	Delete account

Billing

Method	Endpoint	Description
POST	`/api/auth/create-checkout`	Create Stripe Checkout session
GET	`/api/auth/usage`	Monthly usage + plan limits
POST	`/api/stripe/webhook`	Stripe event handler
POST	`/api/stripe/portal`	Stripe Customer Portal URL

Webhooks

Method	Endpoint	Description
POST	`/telnyx/webhook`	Telnyx call lifecycle events
WS	`/telnyx/ws`	Bidirectional audio streaming

ML Models

Deepfake Detection

Base model: Speech-Arena-2025/DF_Arena_1B_V_1 (1.15B parameters)
Fine-tuning: LoRA (r=8, alpha=16, dropout=0.1, all-linear layers, ~10M trainable params)
Training data: 1,000 real samples (LibriSpeech, 123 speakers) + 1,000 synthetic samples (Qwen3-TTS voice clones via Replicate)
Augmentation: Phone-call degradation (G.711 μ-law, GSM codec, white/pink noise, band-pass filter, room reverb, packet loss)
Results: 100% accuracy and F1 on the validation and test sets; EER of 0.246 with augmentation
Inference: 4-second sliding window, 0.5s stride, isolation after 3 consecutive detections above threshold
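
For context on the EER figure: the equal error rate is the error at the operating point where the false-alarm rate (real speech flagged as fake) equals the miss rate (fake speech accepted as real). Unlike accuracy, it does not depend on one fixed threshold. The sketch below is a simple threshold sweep, not the project's evaluation script; the function name is hypothetical, and scores are assumed to be spoof scores where higher means more likely fake.

```python
from typing import Sequence


def equal_error_rate(bonafide_scores: Sequence[float], spoof_scores: Sequence[float]) -> float:
    """Approximate the EER by sweeping every observed score as a threshold."""
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores)) + [float("inf")]
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False alarm: real speech with a spoof score at or above the threshold.
        far = sum(s >= t for s in bonafide_scores) / len(bonafide_scores)
        # Miss: fake speech with a spoof score below the threshold.
        frr = sum(s < t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

On perfectly separated scores this returns 0; any overlap between the two score distributions pushes it above 0.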

Scam Detection

Model: Vertex AI fine-tuned Gemini 2.5 Pro with function calling
Training data: 100 synthetic scam conversations generated via ElevenLabs Text-to-Dialogue (21 voices, 10 scam categories)
Categories: IRS/tax, tech support, prize/lottery, bank fraud, investment/crypto, romance, charity, insurance/medicare, job offer, utility service
Inference: 15-second MP3 audio chunks sent to Vertex AI; isolation on confidence >= 0.7
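
Grouping the incoming audio stream into 15-second chunks can be sketched as a simple byte buffer. The class name is hypothetical, and the defaults assume 8 kHz, 1-byte-per-sample μ-law audio, which is an assumption about the stream format. MP3 encoding and the Vertex AI request are deliberately left out.

```python
from typing import List


class ChunkBuffer:
    """Hypothetical accumulator that emits fixed-length audio chunks."""

    def __init__(self, chunk_seconds: float = 15.0, sample_rate: int = 8000,
                 bytes_per_sample: int = 1) -> None:
        # 15 s * 8000 samples/s * 1 byte = 120,000 bytes with the defaults.
        self._chunk_bytes = int(chunk_seconds * sample_rate * bytes_per_sample)
        self._buf = bytearray()

    def feed(self, frame: bytes) -> List[bytes]:
        """Add one audio frame and return any chunks that are now complete."""
        self._buf.extend(frame)
        chunks: List[bytes] = []
        while len(self._buf) >= self._chunk_bytes:
            chunks.append(bytes(self._buf[:self._chunk_bytes]))
            del self._buf[:self._chunk_bytes]
        return chunks
```

Each returned chunk would then be encoded and sent for classification, with isolation triggered when the returned confidence is 0.7 or higher.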

Open Source

We publish two artifacts from this project on Hugging Face:

gereon/voxguard-synthetic-speech -- Synthetic dataset containing deepfake audio samples (Qwen3-TTS voice clones of LibriSpeech speakers) and scam conversation recordings (ElevenLabs Text-to-Dialogue across 10 categories)
gereon/voxguard-lora -- LoRA fine-tune of DF Arena 1B for voice deepfake detection, trained with phone-call audio augmentation (100% accuracy, EER 0.246)

Integrations

Service	Purpose
Telnyx	Inbound call handling, audio streaming, number provisioning, call bridging
ElevenLabs	Conversational AI agents (decoy for attackers, notification for users)
Stripe	Subscription billing, checkout, customer portal
Brevo	Transactional email (welcome), SMS (OTP verification, coupons)
Google Vertex AI	Scam detection model hosting and inference
Google Maps	Geocoding of caller locations + dashboard threat heatmap
Numvalidate	Phone number validation, carrier lookup, location

Built With

cloudflare
elevenlabs
gemini
railway
redhat-openshiftai
replicate
stripe
vertexai

Updates

Felix Hadasch started this project — Feb 22, 2026 05:01 AM EST
