VoxGuard
Real-time phone call deepfake and scam detection that shields users by intercepting threats before they reach the conversation.
How It Works
Caller dials your shield number
        │
        ▼
┌───────────────┐     ┌──────────────────┐     ┌───────────────┐
│    Telnyx     │────▶│ VoxGuard Server  │────▶│  Your Phone   │
│   (Inbound)   │     │                  │     │  (Forwarded)  │
└───────────────┘     └──────────────────┘     └───────────────┘
                           │        │
            ┌──────────────┘        └──────────────┐
            ▼                                      ▼
  ┌───────────────────┐                  ┌───────────────────┐
  │  Deepfake Detect  │                  │    Scam Detect    │
  │ (DF Arena + LoRA) │                  │ (Vertex AI        │
  │ 4s sliding window │                  │  Gemini 2.5 Pro)  │
  │ every 0.5s        │                  │ 15s audio chunks  │
  └───────────────────┘                  └───────────────────┘
            │                                      │
            └──────────────┐        ┌──────────────┘
                           ▼        ▼
                        Threat detected?
                           │        │
                           No      Yes
                           │        │
                           ▼        ▼
                 Call continues   ┌─────────────────────┐
                 normally         │      ISOLATION      │
                                  │ ─ Unbridge caller   │
                                  │ ─ Decoy AI engages  │
                                  │   attacker          │
                                  │ ─ User notified via │
                                  │   AI agent          │
                                  └─────────────────────┘
All calls route through a shield number assigned per user. The caller and user are bridged with live audio streaming. If the deepfake detector (3 consecutive chunks above threshold) or scam detector (confidence >= 0.7) triggers, the call is isolated: the attacker is unbridged and handed to an ElevenLabs decoy agent, while the user is notified by a separate agent.
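The trigger rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the `ThreatMonitor` class and its method names are hypothetical, and the 0.8 spoof threshold is taken from the documented default of `FAKE_THRESHOLD`.

```python
from dataclasses import dataclass

FAKE_THRESHOLD = 0.8       # documented default spoof-score threshold
CONSECUTIVE_REQUIRED = 3   # deepfake windows in a row before isolating
SCAM_CONFIDENCE = 0.7      # scam confidence that triggers isolation


@dataclass
class ThreatMonitor:
    """Tracks detector outputs for one call (hypothetical helper)."""
    consecutive_fakes: int = 0
    isolated: bool = False

    def on_deepfake_score(self, score: float) -> bool:
        """Feed one sliding-window spoof score; return True once isolated."""
        if score > FAKE_THRESHOLD:
            self.consecutive_fakes += 1
        else:
            self.consecutive_fakes = 0  # streak broken, start counting again
        if self.consecutive_fakes >= CONSECUTIVE_REQUIRED:
            self.isolated = True
        return self.isolated

    def on_scam_confidence(self, confidence: float) -> bool:
        """Feed one scam-classifier confidence; return True once isolated."""
        if confidence >= SCAM_CONFIDENCE:
            self.isolated = True
        return self.isolated


monitor = ThreatMonitor()
results = [monitor.on_deepfake_score(s) for s in (0.9, 0.95, 0.4, 0.85, 0.9, 0.99)]
print(results)  # isolation only fires on the third score of an unbroken streak
```

Resetting the counter on any below-threshold window means a single noisy score cannot isolate a call, at the cost of roughly 1.5 seconds of extra latency at a 0.5s stride.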
Architecture
| Layer | Technology |
|---|---|
| Backend | FastAPI, Uvicorn, async Python 3.11 |
| Frontend | Next.js 16 (static export), React 19, TypeScript, Tailwind CSS 4 |
| Database | PostgreSQL 16, SQLAlchemy 2.0 (async), Alembic migrations |
| Deepfake Detection | DF Arena 1B (LoRA fine-tuned), custom inference API |
| Scam Detection | Vertex AI fine-tuned Gemini 2.5 Pro with function calling |
| Telephony | Telnyx Voice API (webhooks + WebSocket audio streaming) |
| AI Agents | ElevenLabs Conversational AI (decoy + user notification) |
| Billing | Stripe (subscriptions, checkout, customer portal) |
| Email / SMS | Brevo (transactional email, OTP verification) |
| Geolocation | Numvalidate (phone lookup) + Google Maps Geocoding |
| Deployment | Docker (multi-stage), Docker Compose, Railway |
Features
- Real-time deepfake voice detection -- 4-second sliding window analysis every 0.5 seconds, isolation on 3 consecutive high-confidence detections
- Real-time scam detection -- 15-second audio chunks analyzed by a fine-tuned Gemini model across 10 scam categories (IRS, tech support, romance, bank fraud, etc.)
- Automatic call isolation -- attacker unbridged and redirected to a decoy AI agent that keeps them engaged
- Per-user shield numbers -- each user gets a dedicated Telnyx number; callers dial the shield number and are transparently forwarded
- Live dashboard -- SSE-powered real-time call monitoring with status badges, confidence meters, and call timelines
- Threat map -- Google Maps heatmap of caller origins weighted by threat type
- Call history -- paginated records with full audio playback (attacker and user tracks), geolocation, and detection details
- Per-user detection toggles -- independently enable/disable deepfake and scam detection
- Subscription billing -- free tier (10 calls/month) and Pro plan via Stripe with a 30-day trial
- Phone verification -- SMS OTP via Brevo with rate limiting
- Caller geolocation -- phone number lookup via Numvalidate + Google Maps geocoding
Project Structure
voxguard/
├── backend/
│ ├── app.py # Main FastAPI app (webhooks, WebSocket, API routes)
│ ├── auth.py # JWT auth, registration, login
│ ├── models.py # SQLAlchemy models (User, CallRecord, NumberPool)
│ ├── database.py # Async PostgreSQL connection
│ ├── detector.py # Deepfake detection API client
│ ├── scam_detector.py # Vertex AI scam classification
│ ├── agents.py # ElevenLabs agent bridge (WebSocket ↔ audio)
│ ├── audio_utils.py # μ-law ↔ PCM16 ↔ WAV conversion
│ ├── phone_lookup.py # Numvalidate + Google Maps geolocation
│ ├── telnyx_numbers.py # Shield number provisioning & pool management
│ ├── email_service.py # Brevo transactional email
│ ├── verification.py # SMS OTP verification
│ ├── tts.py # ElevenLabs TTS utility
│ ├── requirements.txt
│ └── migrations/ # Alembic schema migrations (7 versions)
├── frontend/
│ ├── src/app/
│ │ ├── page.tsx # Landing page (hero, pricing, how it works)
│ │ ├── dashboard/ # Real-time call monitoring dashboard
│ │ ├── account/ # User settings & subscription management
│ │ ├── login/ # Authentication
│ │ ├── register/ # Registration + phone OTP flow
│ │ └── components/ # 19 React components
│ ├── package.json
│ └── next.config.ts # Static export configuration
├── deepfake-detection/
│ ├── prepare_data_1k.py # Data pipeline (LibriSpeech + Whisper + Replicate TTS)
│ ├── train_lora_1k_aug.py # LoRA fine-tuning with phone-call augmentation
│ ├── augment_phone.py # Audio degradation (G.711, GSM, noise, reverb, packet loss)
│ ├── eval_held_out.py # Held-out evaluation
│ └── README.md # ML pipeline documentation
├── scam-detection/
│ ├── generate_scam_calls.py # Generate 100 scam conversations via ElevenLabs
│ └── prepare_vertex_finetune.py # Prepare JSONL + launch Vertex AI fine-tuning
├── Dockerfile # Multi-stage build (Node.js frontend + Python backend)
├── docker-compose.yml # App + PostgreSQL services
└── .env.example # Environment variable template
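As an illustration of the μ-law to PCM16 conversion that `audio_utils.py` is described as handling, here is a minimal pure-Python sketch of standard G.711 μ-law decoding. The function names are hypothetical and the real module may use a different approach (for example a library call); only the decoding arithmetic is standard.

```python
import struct


def ulaw_byte_to_pcm16(byte: int) -> int:
    """Decode one G.711 mu-law byte into a signed 16-bit PCM sample."""
    u = ~byte & 0xFF                 # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def ulaw_to_pcm16(data: bytes) -> bytes:
    """Decode a mu-law byte string into little-endian PCM16 bytes."""
    samples = [ulaw_byte_to_pcm16(b) for b in data]
    return struct.pack(f"<{len(samples)}h", *samples)


pcm = ulaw_to_pcm16(bytes([0xFF, 0x00, 0x80]))
print(struct.unpack("<3h", pcm))
```

Each μ-law byte expands to two PCM bytes, so a decoded buffer is always twice the length of its input.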
Setup
Prerequisites
- Python 3.11+
- Node.js 20+
- PostgreSQL 16+
- Docker & Docker Compose (for containerized setup)
Environment Variables
Copy the example and fill in your keys:
cp .env.example .env
Key variables:
| Variable | Description |
|---|---|
| TELNYX_API_KEY | Telnyx Voice API key |
| TELNYX_CONNECTION_ID | Telnyx SIP connection ID |
| PUBLIC_WSS_URL | Public WebSocket URL for audio streaming (e.g., wss://your-domain/telnyx/ws) |
| ELEVEN_API_KEY | ElevenLabs API key |
| ELEVEN_SCAMMER_AGENT_ID | ElevenLabs decoy agent ID |
| ELEVEN_USER_AGENT_ID | ElevenLabs user notification agent ID (deepfake) |
| ELEVEN_SCAM_USER_AGENT_ID | ElevenLabs user notification agent ID (scam) |
| DETECTOR_API_URL | Deepfake model inference endpoint |
| FAKE_THRESHOLD | Spoof score threshold (default: 0.8) |
| DATABASE_URL | PostgreSQL connection string |
| JWT_SECRET | Secret for JWT signing |
| GOOGLE_SERVICE_ACCOUNT_JSON | GCP service account JSON (for Vertex AI scam detection) |
| BREVO_API_KEY | Brevo API key (email + SMS) |
| STRIPE_SECRET_KEY | Stripe secret key |
| STRIPE_PRICE_ID | Stripe subscription price ID |
| WEBHOOK_SECRET | Stripe webhook signing secret |
| NUMVALIDATE_API_KEY | Phone number lookup API key |
| GOOGLE_MAPS_API_KEY | Google Maps Geocoding API key |
| SITE_URL | Frontend URL (default: https://voxguard.org) |
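A small loader makes the documented defaults explicit and fails fast when a key is missing. The sketch below is illustrative only: the `Settings` class, `load_settings` function, and the choice of which variables count as required are assumptions, not the backend's actual configuration code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: these three are treated as mandatory for this example.
REQUIRED = ("TELNYX_API_KEY", "DATABASE_URL", "JWT_SECRET")


@dataclass(frozen=True)
class Settings:
    telnyx_api_key: str
    database_url: str
    jwt_secret: str
    fake_threshold: float
    site_url: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return Settings(
        telnyx_api_key=env["TELNYX_API_KEY"],
        database_url=env["DATABASE_URL"],
        jwt_secret=env["JWT_SECRET"],
        # Defaults below match the values documented in the table.
        fake_threshold=float(env.get("FAKE_THRESHOLD", "0.8")),
        site_url=env.get("SITE_URL", "https://voxguard.org"),
    )


settings = load_settings({
    "TELNYX_API_KEY": "example-key",
    "DATABASE_URL": "postgresql://localhost/voxguard",
    "JWT_SECRET": "change-me",
})
print(settings.fake_threshold, settings.site_url)
```

Passing a plain dict, as in the usage lines, keeps the loader easy to test without touching the real environment.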
Docker (recommended)
docker compose up --build
This starts the FastAPI backend (with the frontend static export bundled in) on port 8000 and PostgreSQL on port 5432.
Local Development
Backend:
cd backend
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
alembic upgrade head
uvicorn app:app --host 0.0.0.0 --port 8000 --reload
Frontend:
cd frontend
npm install
npm run dev
The frontend dev server runs on http://localhost:3000 with Turbopack. For production, the frontend is statically exported (next build) and served by the backend.
API Overview
Authentication
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/auth/register | Register (email, password, phone) |
| POST | /api/auth/login | Login |
| POST | /api/auth/logout | Logout |
| GET | /api/auth/me | Current user profile |
| POST | /api/auth/verify-phone | Verify phone OTP |
| POST | /api/auth/resend-code | Resend OTP (60s cooldown) |
| POST | /api/auth/retry-provision | Retry shield number provisioning |
Calls & Dashboard
| Method | Endpoint | Description |
|---|---|---|
| GET | /events | SSE stream (live call updates) |
| GET | /api/calls | Paginated call history |
| GET | /api/calls/{id} | Call detail with timeline |
| GET | /api/calls/{id}/audio/{track} | Stream call audio (attacker or user) |
| GET | /api/stats | Dashboard statistics |
| GET | /api/map-points | Threat map geolocation data |
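For readers who want to consume the /events stream outside the dashboard, the sketch below parses the standard text/event-stream line format. It is a minimal example under stated assumptions: the event name `call_update` and the JSON payload fields are invented for illustration, since the actual event schema is not documented here.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per event from the decoded lines of an SSE body."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                   # a blank line dispatches the event
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):       # comment line, often a keep-alive
            continue
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):    # one leading space is stripped
                value = value[1:]
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream content; real field names may differ.
sample = [
    ": keep-alive",
    "event: call_update",
    'data: {"call_id": "abc", "status": "isolated"}',
    "",
]
events = list(parse_sse(sample))
for evt in events:
    payload = json.loads(evt["data"])
    print(evt["event"], payload["status"])
```

In practice the lines would come from a streaming HTTP response rather than a list; the parser only needs an iterable of decoded text lines.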
User Management
| Method | Endpoint | Description |
|---|---|---|
| PATCH | /api/user/phone | Update phone number |
| PATCH | /api/user/detection-settings | Toggle deepfake/scam detection |
| DELETE | /api/user/account | Delete account |
Billing
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/auth/create-checkout | Create Stripe Checkout session |
| GET | /api/auth/usage | Monthly usage + plan limits |
| POST | /api/stripe/webhook | Stripe event handler |
| POST | /api/stripe/portal | Stripe Customer Portal URL |
Webhooks
| Method | Endpoint | Description |
|---|---|---|
| POST | /telnyx/webhook | Telnyx call lifecycle events |
| WS | /telnyx/ws | Bidirectional audio streaming |
ML Models
Deepfake Detection
- Base model: Speech-Arena-2025/DF_Arena_1B_V_1 (1.15B parameters)
- Fine-tuning: LoRA (r=8, alpha=16, dropout=0.1, all-linear layers, ~10M trainable params)
- Training data: 1,000 real samples (LibriSpeech, 123 speakers) + 1,000 synthetic samples (Qwen3-TTS voice clones via Replicate)
- Augmentation: Phone-call degradation (G.711 μ-law, GSM codec, white/pink noise, band-pass filter, room reverb, packet loss)
- Results: 100% accuracy and F1 on val/test sets; EER 0.246 with augmentation
- Inference: 4-second sliding window, 0.5s stride, isolation after 3 consecutive detections above threshold
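The windowing scheme in the inference step can be sketched as follows. This is an offline illustration, not the server's streaming code: the `sliding_windows` function is hypothetical, and the 8 kHz sample rate is an assumption based on typical G.711 telephony audio.

```python
from typing import Iterator, List

SAMPLE_RATE = 8000      # assumption: 8 kHz telephony audio
WINDOW_SECONDS = 4.0    # window length from the inference description
STRIDE_SECONDS = 0.5    # hop between consecutive windows


def sliding_windows(samples: List[int], sample_rate: int = SAMPLE_RATE) -> Iterator[List[int]]:
    """Yield overlapping 4-second windows, advancing 0.5 seconds each time."""
    window = int(WINDOW_SECONDS * sample_rate)
    stride = int(STRIDE_SECONDS * sample_rate)
    for start in range(0, len(samples) - window + 1, stride):
        yield samples[start:start + window]


audio = [0] * (6 * SAMPLE_RATE)   # six seconds of silence as a stand-in
windows = list(sliding_windows(audio))
print(len(windows), len(windows[0]))  # 5 windows of 32000 samples each
```

Under these assumptions the first score is available after 4 seconds of audio, and three consecutive detections need at least 5 seconds in total. A live implementation would more likely keep a rolling buffer and score it every time another 0.5 seconds arrives.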
Scam Detection
- Model: Vertex AI fine-tuned Gemini 2.5 Pro with function calling
- Training data: 100 synthetic scam conversations generated via ElevenLabs Text-to-Dialogue (21 voices, 10 scam categories)
- Categories: IRS/tax, tech support, prize/lottery, bank fraud, investment/crypto, romance, charity, insurance/medicare, job offer, utility service
- Inference: 15-second MP3 audio chunks sent to Vertex AI; isolation on confidence >= 0.7
Open Source
We publish two artifacts from this project on Hugging Face:
- gereon/voxguard-synthetic-speech -- Synthetic dataset containing deepfake audio samples (Qwen3-TTS voice clones of LibriSpeech speakers) and scam conversation recordings (ElevenLabs Text-to-Dialogue across 10 categories)
- gereon/voxguard-lora -- LoRA fine-tune of DF Arena 1B for voice deepfake detection, trained with phone-call audio augmentation (100% accuracy, EER 0.246)
Integrations
| Service | Purpose |
|---|---|
| Telnyx | Inbound call handling, audio streaming, number provisioning, call bridging |
| ElevenLabs | Conversational AI agents (decoy for attackers, notification for users) |
| Stripe | Subscription billing, checkout, customer portal |
| Brevo | Transactional email (welcome), SMS (OTP verification, coupons) |
| Google Vertex AI | Scam detection model hosting and inference |
| Google Maps | Geocoding of caller locations + dashboard threat heatmap |
| Numvalidate | Phone number validation, carrier lookup, location |
Built With
- cloudflare
- elevenlabs
- gemini
- railway
- redhat-openshiftai
- replicate
- stripe
- vertexai