ELV — Voice-First Website Support

## Inspiration

Customer support is broken. Live chat feels robotic, knowledge bases go unread, and hiring 24/7 agents is expensive. We asked: what if visitors could just talk to your website?

Voice is the most natural interface humans have — yet no embeddable support tool uses it as the primary channel. ELV was born to change that.

## What it does

ELV is a voice-first embeddable support widget — think Intercom, but powered by conversation instead of text bubbles.

- A business adds a single `<script>` tag to its site.
- Visitors click the widget and speak their question out loud.
- An AI voice agent answers in real time, using the website's own content as its knowledge base through retrieval-augmented generation (RAG).
- Business owners manage everything (crawled pages, agent behavior, analytics) through a dashboard.

## How we built it

ELV is a full-stack monorepo built with Turborepo + pnpm:

| Layer | Stack |
|-------|-------|
| Widget | Preact + TypeScript: lightweight loader.js + iframe SPA |
| Voice Agent | Python + LiveKit Agents SDK: real-time WebRTC voice |
| STT / TTS | Deepgram Nova-3 (speech-to-text) + Cartesia Sonic-3 (text-to-speech) |
| LLM + RAG | GPT-4o-mini + OpenAI embeddings + pgvector for retrieval |
| API | FastAPI + Alembic migrations + PostgreSQL 16 |
| Ingestion | Playwright crawler + Redis Queue workers for embedding generation |
| Dashboard | Next.js 14 (App Router) + Clerk auth + Tailwind + shadcn/ui |

The architecture separates concerns cleanly: the widget handles UI, the API handles auth and data, and the agent service handles real-time voice + RAG independently.

## Challenges we faced

- **Real-time voice latency:** Keeping the round trip (speech → STT → LLM → TTS → audio) under 1.5 s required careful pipeline optimization and streaming at every stage.
- **RAG quality:** Chunking web pages for retrieval is deceptively hard. Chunks that are too large dilute the context; chunks that are too small produce answers that lack coherence. We iterated heavily on chunk sizing and overlap.
- **Widget embedding:** Making a widget that works on any website without style conflicts meant strict iframe isolation and a postMessage bridge for communication.
- **Crawler reliability:** Real-world websites are messy. With SPAs, auth walls, and infinite scroll, the ingestion pipeline needed robust error handling and retry logic.
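
The chunk sizing and overlap trade-off mentioned above can be illustrated with a minimal sketch. This is not the ELV ingestion code: the function name `chunk_text`, the word-based splitting, and the default sizes are all assumptions chosen for illustration. Overlap means each chunk repeats the last few words of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words.

    Hypothetical helper for illustration only. Sizes are counted in
    whitespace-separated words, not model tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text,
        # so we do not emit a trailing chunk that is pure overlap.
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    page = " ".join(f"w{i}" for i in range(10))
    for chunk in chunk_text(page, chunk_size=4, overlap=1):
        print(chunk)
```

A larger `chunk_size` keeps more surrounding context per chunk at the cost of diluting the match; a larger `overlap` reduces boundary cuts at the cost of storing and embedding more duplicated text.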

## What we learned

- Voice UX is fundamentally different from chat UX: silence feels broken, so we added filler audio and visual feedback to keep the experience feeling alive.
- pgvector with proper indexing handles RAG retrieval surprisingly well at our scale.
- Preact's 3KB footprint was the right call for an embeddable widget, because every kilobyte matters when you're loading on someone else's site.
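
To make the retrieval step concrete, here is a small pure-Python sketch of what a cosine-distance nearest-neighbour lookup computes. It is a brute-force illustration under stated assumptions, not the ELV query path: the names `cosine_distance` and `top_k` and the toy two-dimensional embeddings are hypothetical. In pgvector, cosine distance is exposed through the `<=>` operator, and an index lets the database avoid scanning every row the way this loop does.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity, the quantity pgvector's `<=>` operator orders by."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def top_k(query: list[float], chunks: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    """Return the texts of the k chunks closest to the query embedding.

    Brute-force scan for illustration; roughly what
    `ORDER BY embedding <=> :query LIMIT k` does in SQL
    (the column and parameter names there are hypothetical).
    """
    ranked = sorted(chunks, key=lambda item: cosine_distance(query, item[1]))
    return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy 2-dimensional embeddings; real embeddings have hundreds of dimensions or more.
    store = [
        ("pricing page", [1.0, 0.0]),
        ("shipping policy", [0.0, 1.0]),
        ("refund policy", [0.9, 0.1]),
    ]
    print(top_k([1.0, 0.0], store, k=2))
```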

## What's next for ELV

- Multi-language support (agent speaks the visitor's language)
- Conversation analytics and intent clustering in the dashboard
- Custom voice cloning so businesses can have a branded voice
- Shopify / WordPress one-click install plugins

## Built With

cartesia
clerk
cloudflare
css
deepgram
fastapi
livekit
next.js
openai
pages
pgvector
playwright
postgresql
preact
python
redis
shadcn/ui
tailwind
turborepo
typescript
vercel

Updates

dinesh buddy started this project — Apr 04, 2026 02:22 PM EDT
