Inspiration

Hundreds of insurance advisory calls happen every day — yet evaluating whether agents truly follow the script, disclose the right information, and speak authentically is still done manually, inconsistently, and slowly.

We asked: what if AI could read the rulebook and grade every call automatically?

That became VoiceIQ.


What it does

Two things, end to end:

Turn documents into a scoring rulebook. Upload any insurance product documentation, and Qwen extracts the key facts and synthesizes them into a structured Sales Rule (product highlights, target customer profile, consultation steps, and measurable pass/fail criteria) that is ready for the team to review and save.

Score any recorded call against that rulebook. Upload audio, and VoiceIQ returns a full report: what was said, who said it, how well the consultation matched the rule, and whether the voice itself was real or AI-generated.
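
To make the shape of a Sales Rule concrete, here is a minimal sketch of how the four parts listed above could be represented before review. The class names, field names, and the `validate` helper are illustrative assumptions, not VoiceIQ's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Criterion:
    """One measurable pass/fail check (hypothetical shape)."""
    name: str
    description: str
    weight: float = 1.0


@dataclass
class SalesRule:
    """Assumed container for the four parts of a Sales Rule."""
    product_highlights: list[str]
    target_customer_profile: str
    consultation_steps: list[str]
    criteria: list[Criterion] = field(default_factory=list)

    def validate(self) -> list[str]:
        """Return the problems a reviewer should fix before saving."""
        problems = []
        if not self.product_highlights:
            problems.append("no product highlights")
        if not self.consultation_steps:
            problems.append("no consultation steps")
        if not self.criteria:
            problems.append("no pass/fail criteria")
        if any(c.weight <= 0 for c in self.criteria):
            problems.append("criterion weights must be positive")
        return problems
```

An empty list from `validate()` would mean the draft rule is structurally complete and can go to the team for review.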


How we built it

Everything runs on Qwen models via DashScope. Different models handle different stages — document understanding, audio transcription, speaker role identification, content evaluation, and linguistic analysis — working together as a single pipeline.
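
The staged design described above can be sketched as a chain of functions that each read a shared context and add to it. This is only an illustration: the stage functions below are stand-ins that return placeholder data, and none of the names reflect the real VoiceIQ code or any DashScope API.

```python
from typing import Callable

# A stage reads the shared context and returns the keys it adds.
Stage = Callable[[dict], dict]


def run_pipeline(stages: list[tuple[str, Stage]], context: dict) -> dict:
    """Run stages in order, merging each stage's output into the context."""
    for name, stage in stages:
        try:
            update = stage(context)
        except Exception as exc:
            # Name the failing stage so errors are easy to trace.
            raise RuntimeError(f"stage '{name}' failed") from exc
        context = {**context, **update}
    return context


# Stand-in stages (assumptions); real ones would call a model.
def transcribe(ctx: dict) -> dict:
    return {"transcript": f"<transcript of {ctx['audio_path']}>"}


def identify_roles(ctx: dict) -> dict:
    return {"turns": [("agent", ctx["transcript"])]}


def evaluate(ctx: dict) -> dict:
    return {"turns_evaluated": len(ctx["turns"])}
```

Keeping every stage behind the same signature means one model can be swapped for another at a single stage without touching the rest of the chain.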

For voice authenticity, we built a multi-signal detection engine that combines acoustic analysis with Qwen-powered linguistic fingerprinting, catching TTS-generated voices without any training data.
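
One simple way to combine two independent signals is a weighted average followed by a threshold. The sketch below assumes each signal has already been reduced to a score in [0, 1], where higher means "more likely synthetic"; the weight, the threshold, and the function name are hypothetical, and the actual fusion rule used in VoiceIQ may differ.

```python
def fuse_signals(
    acoustic: float,
    linguistic: float,
    w_acoustic: float = 0.5,   # assumed weight, not a tuned value
    threshold: float = 0.5,    # assumed decision threshold
) -> tuple[float, bool]:
    """Fuse two scores in [0, 1] into one score and a synthetic-voice flag."""
    for name, value in (("acoustic", acoustic), ("linguistic", linguistic)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} score must be in [0, 1], got {value}")
    if not 0.0 <= w_acoustic <= 1.0:
        raise ValueError("w_acoustic must be in [0, 1]")
    score = w_acoustic * acoustic + (1.0 - w_acoustic) * linguistic
    return score, score >= threshold
```

Because neither weight nor threshold is learned, a rule like this needs no labelled dataset, at the cost of having to pick both values by analysis.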


Challenges we ran into

  • Making Qwen score strictly and consistently — not generously — required deep prompt design and server-side guardrails the model cannot override
  • Separating who said what from a flat transcript, across any number of speakers in any order, using only language understanding
  • Detecting AI voices with no labelled dataset: the detector had to be derived analytically from two independent signal types (acoustic and linguistic), fused together
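
A server-side guardrail of the kind mentioned in the first challenge could work by recomputing the score from per-criterion verdicts instead of trusting any total the model reports. The sketch below is an assumption about how that might look; the `verdicts` and `score` keys and the weighting scheme are hypothetical.

```python
def server_side_score(weights: dict[str, float], model_output: dict) -> float:
    """Recompute a 0-100 score from verdicts, ignoring the model's own total.

    Only criteria defined in `weights` count, and only an explicit
    boolean True counts as a pass; anything else is treated as a fail.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("rule must have positive total weight")
    verdicts = model_output.get("verdicts", {})
    earned = sum(w for name, w in weights.items() if verdicts.get(name) is True)
    return round(100.0 * earned / total, 1)


# The model claims 100 and invents a "bonus" criterion; neither is honoured.
weights = {"disclosure": 2.0, "needs_analysis": 1.0, "closing": 1.0}
output = {
    "score": 100,
    "verdicts": {"disclosure": True, "closing": "yes", "bonus": True},
}
print(server_side_score(weights, output))  # 50.0
```

Here only `disclosure` earns credit: `needs_analysis` is missing, `closing` is not a strict boolean, and `bonus` is not part of the rule.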

Accomplishments that we're proud of

  • A pipeline that goes from raw audio to a scored, explainable report with no human in the loop
  • Scores that are computed and bounded server-side, so they stay tamper-proof regardless of what the model outputs
  • An AI voice detector built with zero training data
  • Verdicts that Qwen returns in both Vietnamese and English in a single response

What we learned

Qwen is genuinely strong at Vietnamese speech nuance and domain-specific insurance terminology. Getting it to be a strict grader rather than a generous one turned out to be the hardest prompt engineering challenge of the project.


What's next for VoiceIQ

Real-time scoring on live calls, a manager dashboard for team-wide trends, and auto-generated post-call coaching — so every agent improves before the next conversation.
