VoiceGuard — Self-Improving AI Compliance Monitor
Inspiration
Every day, millions of customer interactions happen over voice channels — sales calls, support tickets, insurance claims, financial advising. Companies deploy AI voice agents at scale, but here's the problem: who watches the AI?
Traditional compliance monitoring is reactive — a human reviews a random 2% sample of calls days after they happen. By then, the damage is done: a rogue agent promised something it shouldn't have, a frustrated customer churned, or a deepfake slipped through. We realized the same self-improving agent architecture the hackathon challenges us to build is exactly what's missing in voice compliance.
The inspiration clicked when we connected three dots: Modulate's Velma can hear what's really happening in a conversation (emotion, intent, deception — not just words), Airia can orchestrate autonomous agent workflows that adapt without human intervention, and Lightdash can surface the patterns that tell an agent where to improve. Together, they form a closed feedback loop — an agent that monitors, learns, and gets better at monitoring. That's not just automation. That's a self-improving system.
What it does
VoiceGuard is an autonomous AI compliance monitor that listens to voice conversations in real-time, detects risks, and continuously improves its own detection capabilities — without human intervention.
The core loop works in four stages:
Listen — Audio from customer calls streams through Modulate's Velma API, which analyzes emotion (anger, frustration, stress), detects toxicity and harassment, identifies deepfake audio, and transcribes with context-aware accuracy. Velma doesn't just hear words — it understands tone, intent, and behavioral red flags that text-only systems miss entirely.
Reason — Airia orchestrates the compliance agent pipeline. When Velma flags a high-stress interaction or potential policy violation, Airia routes the analysis through a multi-step reasoning chain: classify the violation type, assess severity, cross-reference against company compliance rules, and determine the appropriate action (alert, escalate, log, or coach).
Track — Every analysis result flows into Lightdash dashboards that visualize compliance health in real-time. Emotion trends over time, violation categories by agent, satisfaction scores by call type, false positive rates — all auto-updating and embeddable. This isn't just reporting; it's the agent's own performance scorecard.
Improve — Here's where self-improvement happens. The agent periodically queries its own Lightdash metrics via the API. When it detects degrading accuracy (rising false positives), emerging patterns (a new type of violation it's been missing), or shifts in baseline behavior, it autonomously adjusts its detection thresholds and reasoning prompts through Airia's pipeline configuration. No human tunes the knobs — the agent reads its own dashboards and adapts.
Real-world example: VoiceGuard notices that its "frustrated customer" detection has a 30% false positive rate on calls from a specific product line. It queries Lightdash, identifies that these calls have naturally higher emotional intensity because the product is complex, autonomously creates a calibrated threshold for that segment via Airia, and reduces false positives to 8% — all without a human touching the system.
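A minimal sketch of how a per-segment calibration like this might work, assuming each reviewed call carries a frustration score between 0 and 1 and a human-verified label. The field names, the candidate grid, and the sample data are hypothetical; nothing here is taken from VoiceGuard's code or from the Airia or Lightdash APIs.

```python
def false_positive_rate(calls, threshold):
    """Fraction of non-frustrated calls that the threshold would still flag."""
    negatives = [c for c in calls if not c["frustrated"]]
    if not negatives:
        return 0.0
    flagged = sum(1 for c in negatives if c["score"] >= threshold)
    return flagged / len(negatives)


def calibrate(calls, target_fpr, candidates):
    """Return the lowest candidate threshold that meets the target rate.

    Restricting the search to a fixed candidate grid keeps the adjustment
    bounded. If no candidate meets the target, the highest one is returned.
    """
    for threshold in sorted(candidates):
        if false_positive_rate(calls, threshold) <= target_fpr:
            return threshold
    return max(candidates)


# Hypothetical reviewed calls from one product segment.
SEGMENT_CALLS = (
    [{"score": s, "frustrated": False}
     for s in (0.55, 0.62, 0.65, 0.70, 0.72, 0.40, 0.30, 0.45, 0.50, 0.35)]
    + [{"score": s, "frustrated": True} for s in (0.85, 0.90, 0.80)]
)

CANDIDATES = (0.60, 0.65, 0.70, 0.75, 0.80)
segment_threshold = calibrate(SEGMENT_CALLS, target_fpr=0.10, candidates=CANDIDATES)
```

With this sample data, the default threshold of 0.60 flags 4 of the 10 non-frustrated calls, and the search settles on 0.75 for the segment.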
How we built it
Architecture:
- Backend: FastAPI (Python) serving as the orchestration hub
- Voice Intelligence: Modulate Velma API for audio analysis — emotion detection, toxicity scoring, deepfake detection, and transcription
- Agent Orchestration: Airia Python SDK (pip install airia) for pipeline execution, with separate pipelines for analysis, reasoning, and self-improvement cycles
- Analytics & BI: Lightdash REST API for programmatic dashboard creation, SQL query execution, and embedded analytics with JWT-signed URLs
- LLM Reasoning: Claude API for the compliance reasoning and self-improvement decision-making layers
- Frontend: React + Tailwind CSS dashboard showing real-time analysis feed, embedded Lightdash visualizations, and the agent's self-improvement log
Key integration patterns:
- Modulate audio analysis results are structured as JSON events and piped into Airia pipelines as userInput
- Airia pipeline responses trigger Lightdash SQL inserts to update the metrics warehouse
- A scheduled self-improvement cycle queries Lightdash's v2 SQL endpoint, feeds metrics into a dedicated Airia "improvement" pipeline, and logs every autonomous adjustment with full reasoning traces
- The entire system runs as a single deployable service with no manual intervention required after startup
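The first pattern above, structuring analysis results as JSON events, could look roughly like the sketch below. The `VoiceEvent` fields and the `to_user_input` helper are hypothetical illustrations, not Velma's actual response format or an Airia SDK call; the sketch only shows serializing one event into a string that could be handed to a pipeline as userInput.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def _utc_now() -> str:
    """ISO 8601 timestamp in UTC."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class VoiceEvent:
    """One analysis result for one call (all field names are assumptions)."""
    call_id: str
    emotion: str
    emotion_score: float      # assumed range 0.0 to 1.0
    toxicity_score: float     # assumed range 0.0 to 1.0
    deepfake_suspected: bool
    transcript: str
    observed_at: str = field(default_factory=_utc_now)


def to_user_input(event: VoiceEvent) -> str:
    """Serialize the event to a JSON string with stable key order."""
    return json.dumps(asdict(event), sort_keys=True)


event = VoiceEvent(
    call_id="call-001",
    emotion="frustration",
    emotion_score=0.72,
    toxicity_score=0.05,
    deepfake_suspected=False,
    transcript="I have been waiting for a refund for three weeks.",
)
payload = to_user_input(event)
```

Keeping one schema for every stage means the same record can be passed to the reasoning pipeline and later written to the metrics warehouse without reshaping.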
Challenges we ran into
Modulate API access was the biggest puzzle. Velma's enterprise API isn't publicly self-serve yet, so we reverse-engineered the preview demo at preview.modulate.ai using Chrome DevTools to capture the actual endpoint and request format. This took some creative problem-solving, and it exercised a core hackathon skill: working with what's available under time pressure.
Closing the self-improvement loop without it spiraling. An agent that can modify its own behavior could theoretically over-correct and degrade. We implemented guardrails: the improvement pipeline can only adjust thresholds within bounded ranges, every change is logged with the reasoning chain that triggered it, and there's a rollback mechanism if post-change metrics drop below baseline. Getting these safety boundaries right in a 5-hour window was intense.
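The three guardrails described above (bounded adjustments, logged reasoning, rollback) can be sketched as follows. This is an illustrative model under stated assumptions, not the project's implementation: the `GuardedThreshold` class, its method names, and the single-number accuracy metric are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class GuardedThreshold:
    """A detection threshold the agent may only move within fixed bounds."""
    value: float
    lower: float
    upper: float
    history: list = field(default_factory=list)

    def propose(self, new_value: float, reason: str) -> float:
        """Clamp the proposed value into [lower, upper], log it, apply it."""
        clamped = min(self.upper, max(self.lower, new_value))
        self.history.append(
            {"old": self.value, "new": clamped, "reason": reason}
        )
        self.value = clamped
        return clamped

    def rollback_if_worse(self, baseline: float, post_change: float) -> bool:
        """Undo the latest change if the post-change metric fell below baseline."""
        if not self.history or post_change >= baseline:
            return False
        last = self.history[-1]
        self.history.append(
            {"old": last["new"], "new": last["old"],
             "reason": "rollback: metric dropped below baseline"}
        )
        self.value = last["old"]
        return True


threshold = GuardedThreshold(value=0.60, lower=0.40, upper=0.80)
applied = threshold.propose(0.95, reason="rising false positives")
rolled_back = threshold.rollback_if_worse(baseline=0.90, post_change=0.85)
```

In this run the proposal of 0.95 is clamped to the upper bound of 0.80, and because the hypothetical accuracy metric then falls from 0.90 to 0.85, the threshold returns to 0.60. Both steps remain in `history`, so the audit trail survives the rollback.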
Connecting three very different API paradigms. Modulate is audio-in/analysis-out, Airia is pipeline orchestration with SSE streaming, and Lightdash is a BI query engine with async result fetching. Making these three talk to each other seamlessly required careful async handling and a unified event schema that all three could work with.
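One way to picture the async handling is a chain of coroutines that pass a shared event dictionary from stage to stage. The sketch below uses stub functions in place of the real services; `analyze_audio`, `reason`, and `record` are hypothetical stand-ins and make no network calls to Modulate, Airia, or Lightdash.

```python
import asyncio


async def analyze_audio(call_id: str) -> dict:
    """Stub for the audio-in/analysis-out stage."""
    await asyncio.sleep(0)  # placeholder for an awaited HTTP request
    return {"call_id": call_id, "emotion": "frustration", "score": 0.7}


async def reason(event: dict) -> dict:
    """Stub for the pipeline stage that decides on an action."""
    await asyncio.sleep(0)  # placeholder for a streamed pipeline response
    action = "alert" if event["score"] >= 0.5 else "log"
    return {**event, "action": action}


async def record(result: dict, sink: list) -> None:
    """Stub for writing the result to the metrics warehouse."""
    await asyncio.sleep(0)  # placeholder for an async insert
    sink.append(result)


async def process_call(call_id: str, sink: list) -> dict:
    """Run one call through the three stages in order."""
    event = await analyze_audio(call_id)
    result = await reason(event)
    await record(result, sink)
    return result


async def process_many(call_ids: list) -> tuple:
    """Process several calls concurrently; results keep input order."""
    sink: list = []
    results = await asyncio.gather(*(process_call(c, sink) for c in call_ids))
    return results, sink


results, sink = asyncio.run(process_many(["call-001", "call-002"]))
```

Each call moves through its stages sequentially, while separate calls overlap, so a slow response from one service does not block the others.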
Time management as a solo builder. With 5.5 hours of coding time, every architectural decision had to be right the first time. We planned the full system architecture before writing a single line of code, which meant the implementation phase was focused and efficient rather than exploratory.
Accomplishments that we're proud of
A genuinely self-improving system, not just a demo. VoiceGuard doesn't just claim to self-improve — you can watch it happen. The Lightdash dashboard shows the before/after metrics for every autonomous adjustment the agent makes. During our demo, the agent completed three self-improvement cycles, each measurably improving its detection accuracy.
Deep integration of all three sponsor tools. This isn't a project that bolted on sponsor APIs as an afterthought. Modulate is the ears, Airia is the brain, Lightdash is the memory — remove any one and the system doesn't function. Each tool plays to its actual strength rather than being forced into a role.
The feedback loop is real. The agent queries its own performance dashboards and makes autonomous decisions based on what it finds. This is the kind of architecture that scales — the same pattern works for any domain where an AI agent needs to monitor, evaluate, and improve its own behavior.
Production-grade reasoning traces. Every compliance decision includes a full audit trail: what Velma detected, how Airia's pipeline reasoned about it, what the final determination was, and why. This isn't just a hackathon demo — it's the kind of explainability that enterprises actually need for regulatory compliance.
What we learned
Voice is an underserved modality in AI agents. Most agent frameworks focus on text and tool-calling. But voice carries vastly more information — emotion, stress, deception, intent — that text analysis misses entirely. Modulate's Ensemble Listening Model approach of using specialized sub-models for each signal type is architecturally elegant and produces richer signals than any transcript-then-analyze pipeline could.
Orchestration platforms are the missing layer. Building a self-improving agent from scratch means writing a lot of boilerplate: routing, error handling, guardrails, versioning. Airia abstracts exactly the right things — you focus on the agent logic and let the platform handle execution, failover, and governance. For production AI agents, this kind of orchestration isn't optional.
BI tools are powerful agent memory systems. We'd never thought of using a BI platform as an agent's self-awareness layer before this hackathon. But it makes perfect sense: Lightdash gives the agent structured, queryable access to its own historical performance. It's more useful than vector stores for this use case because the agent needs aggregated trends, not raw retrieval.
Self-improvement needs boundaries. An unconstrained self-modifying system is dangerous. The most important engineering decision we made was defining what the agent can't change about itself — the compliance rules, the escalation triggers for serious violations, and the logging requirements. Self-improvement works best within a well-defined sandbox.
What's next for VoiceGuard
Real-time streaming integration. The current prototype processes audio files; the production version will tap into Modulate's Twilio Media Streams integration for live call monitoring with sub-second latency. The architecture already supports this — the Airia pipeline just needs a streaming input adapter.
Multi-agent expansion. Right now VoiceGuard is a single agent loop. The next version will deploy specialized sub-agents — one focused on fraud detection patterns, one on customer satisfaction signals, one on regulatory compliance language — all coordinated through Airia's orchestration layer and reporting to a unified Lightdash dashboard.
Benchmark dataset and open evaluation. We want to publish a standardized benchmark for voice compliance monitoring, using synthetic call data, so other teams can compare approaches. The self-improvement metrics we track in Lightdash would become the evaluation framework.
Enterprise pilot. VoiceGuard addresses a real market need — the global speech analytics market is projected to hit $5.1 billion by 2027. We're exploring partnerships with contact center platforms to deploy VoiceGuard as an always-on compliance layer that gets smarter with every call it monitors.
The vision: Every AI voice agent deployed in production has a VoiceGuard watching over it — an autonomous compliance system that improves as fast as the agents it monitors. Not just reactive safety, but proactive, self-evolving trust infrastructure.
Built With
- airia-sdk
- anthropic-claude-api
- docker
- fastapi
- lightdash-api
- modulate-velma-api
- node.js
- pillow
- postgresql
- python
- react
- tailwind-css
- vite