💡 Inspiration
It started with a simple question: Why is quality assurance still so manual in a world driven by AI?
In call centre environments, quality is a core KPI, yet the process behind it is time-consuming and difficult to scale. Quality coaches, supervisors, and team leads spend hours picking calls, listening end-to-end, taking notes, re-listening, and manually scoring interactions, often limiting reviews to only a small sample of total calls. Important signals like rising frustration, missed empathy, or coaching opportunities are frequently buried deep in conversations and easy to overlook.
The result is incomplete visibility into agent performance and delayed feedback, not because teams don’t care about quality, but because the workload makes it impossible to audit every call. QAI was built to change that, not to replace quality coaches, but to help them focus on what matters most, review more calls with less effort, and deliver timely, data-driven coaching that actually improves performance.
🎯 What it does
QAI is an AI-powered customer service quality assurance platform that enables quality coaches and agents to understand, evaluate, and improve customer interactions. Coaches can access complete audio recordings of both the customer and agent, synchronized transcripts, and an interactive timeline that highlights key moments in each conversation.
QAI uses AI to continuously analyze conversations, identifying positive moments, challenges, uncertain interactions, and areas for improvement. These are automatically tagged and timestamped, allowing coaches to quickly access the most relevant parts of each recording without reviewing the entire call.
After each call, QAI generates a detailed QA report that scores agent performance using the company’s custom QA rubric and marking scheme. The report includes category-level scores such as empathy, clarity, resolution, compliance, and sales effectiveness, supported by transcript evidence. By aligning with the organization’s existing QA framework, QAI ensures consistent, fair, and transparent call evaluations.
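As an illustration of how category-level scores could roll up into an overall mark under a rubric like this, here is a minimal sketch; the category names, weights, and 0–5 scale are assumptions for illustration, not QAI's actual marking scheme:

```typescript
// Illustrative sketch of weighted rubric scoring. The categories, weights,
// and 0-5 scale below are assumptions, not QAI's real marking scheme.
type CategoryScore = { category: string; score: number; weight: number };

// Combine per-category scores (0-5) into a weighted overall percentage.
function overallScore(scores: CategoryScore[]): number {
  const totalWeight = scores.reduce((sum, s) => sum + s.weight, 0);
  const weighted = scores.reduce((sum, s) => sum + (s.score / 5) * s.weight, 0);
  return Math.round((weighted / totalWeight) * 100);
}

const report: CategoryScore[] = [
  { category: "empathy",    score: 4, weight: 0.3 },
  { category: "clarity",    score: 5, weight: 0.2 },
  { category: "resolution", score: 3, weight: 0.3 },
  { category: "compliance", score: 5, weight: 0.2 },
];
```

Because the weights come from configuration rather than the model, the same AI category scores always map to the same overall mark, which is what keeps evaluations consistent across coaches.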
In addition to post-call analysis, QAI offers real-time assistance during live calls. While agents speak with customers, the platform provides live suggestions and coaching prompts, such as reminders to acknowledge frustration, adjust tone, or clarify next steps. For sales interactions, QAI delivers contextual tips and proven sales methods to help agents address objections and guide conversations toward better outcomes.
By combining post-call insights with live coaching, QAI shifts quality assurance from a reactive review process to a proactive system. This approach helps agents improve performance in real time and provides quality coaches with tools to scale feedback and enhance service quality across teams.
🛠️ How we built it
Frontend
- Built with Next.js 16 (App Router) and React 19 + TypeScript
- Styled using Tailwind CSS with Radix UI / shadcn-style components
- Three.js (React Three Fiber) hero on the landing page with smooth Motion animations
- Shared layout and AppSidebar across core pages (/upload, /qa, /live, /dashboard, /analytics)
- Recharts for visualizing scores, trends, and radar charts
- Forms handled with React Hook Form + Zod for validation
- User feedback via Sonner toasts, Vaul drawers, and cmdk
- Export functionality for PDF (jsPDF) and Excel (xlsx) reports
Backend
- Server logic implemented using Next.js API Routes under app/api
- No database, keeping the system lightweight and fast
- In-memory state for dashboards, reviews, and analytics
- Temporary audio file storage using a shared file store
- Environment-based configuration via .env.local
- Deployed on Vercel with built-in analytics and monitoring
APIs
- Soniox API for audio transcription and real-time streaming
  - Used for file uploads, live calls, and real-time transcription
- OpenAI API via the Vercel AI SDK
  - Used for QA analysis, coaching notes, brainstorming, and tone/sentiment analysis
  - Structured outputs validated using Zod
- Clear separation between transcription (speech → text) and AI analysis (text → insights)
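In the real app the structured outputs are validated with Zod via the Vercel AI SDK; as a dependency-free sketch of the same idea, here is what the shape check on a model-generated QA flag could look like (the field names and allowed labels are illustrative assumptions):

```typescript
// Dependency-free sketch of the schema check Zod performs on model output.
// Field names ("label", "timestamp", "evidence") and the label set are
// illustrative assumptions, not QAI's exact schema.
type QAFlag = {
  label: "positive" | "challenge" | "uncertain";
  timestamp: number; // seconds into the call
  evidence: string;  // transcript excerpt backing the flag
};

// Reject anything that does not match the expected shape, so malformed
// model output never reaches the dashboard.
function parseFlag(raw: unknown): QAFlag | null {
  if (typeof raw !== "object" || raw === null) return null;
  const r = raw as Record<string, unknown>;
  const labels = ["positive", "challenge", "uncertain"];
  if (typeof r.label !== "string" || !labels.includes(r.label)) return null;
  if (typeof r.timestamp !== "number" || r.timestamp < 0) return null;
  if (typeof r.evidence !== "string" || r.evidence.length === 0) return null;
  return { label: r.label as QAFlag["label"], timestamp: r.timestamp, evidence: r.evidence };
}
```

Keeping the transcription and analysis layers separate means this validation only ever sees text-derived insights, never raw audio concerns.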
⚠️ Challenges we ran into
1. AI Hallucination & Consistency
Problem:
Using the OpenAI API, the model would occasionally produce inconsistent QA interpretations or overconfident classifications, especially when tone shifts were subtle. In some cases, tone was misinterpreted or marked incorrectly during neutral-to-frustrated transitions.
Solution:
- Implemented strict schema validation to enforce consistent QA outputs
- Added confidence thresholds for tone and sentiment classification
- Introduced an “uncertain” category instead of forcing a good/bad label
- Used structured prompting with explicit tone examples and QA rubrics
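The confidence threshold and “uncertain” fallback can be sketched in a few lines; the 0.7 cutoff here is an illustrative assumption, not the tuned value we shipped:

```typescript
// Sketch of the confidence-threshold idea: below the cutoff we return
// "uncertain" rather than forcing a good/bad call. The 0.7 default is
// an illustrative assumption.
type ToneLabel = "positive" | "negative" | "uncertain";

function classifyTone(
  label: "positive" | "negative",
  confidence: number,
  threshold = 0.7
): ToneLabel {
  return confidence >= threshold ? label : "uncertain";
}
```

Low-confidence moments are then surfaced to coaches for human review instead of being silently mislabeled.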
2. Tone Recognition Across Conversations
Problem:
While Soniox provided accurate real-time transcription, detecting emotional tone consistently throughout an entire conversation was challenging. Tone often changes gradually rather than instantly, making single-utterance analysis unreliable.
Solution:
- Tracked tone and sentiment across rolling time windows
- Analyzed pacing, interruptions, and keyword context together
- Detected emotional drift instead of relying on isolated moments
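The rolling-window drift detection above can be sketched as follows; the window size, the [-1, 1] sentiment scale, and the drift cutoff are illustrative assumptions:

```typescript
// Sketch of emotional-drift detection: compare the mean sentiment of the
// earliest and latest windows instead of reacting to single utterances.
// Window size, sentiment scale ([-1, 1]), and cutoff are illustrative.
function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

// sentiments: per-utterance scores in [-1, 1], in conversation order.
// Returns true if the call has drifted negative by more than `cutoff`.
function hasNegativeDrift(sentiments: number[], windowSize = 3, cutoff = 0.5): boolean {
  if (sentiments.length < windowSize * 2) return false; // not enough context yet
  const early = mean(sentiments.slice(0, windowSize));
  const late = mean(sentiments.slice(-windowSize));
  return early - late > cutoff;
}
```

A gradual slide from neutral to frustrated trips this check even when no single utterance looks alarming on its own.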
3. Real-Time vs Post-Call Analysis
Problem:
Running transcription (Soniox), tone analysis, and QA scoring simultaneously during live calls introduced performance and latency constraints.
Solution:
- Prioritized lightweight signals for real-time agent suggestions
- Deferred deeper QA scoring and report generation to post-call processing
- Separated live coaching logic from post-call QA pipelines
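The split between the two paths can be sketched like this; the cue list and prompt text are illustrative stand-ins, not QAI's actual coaching logic:

```typescript
// Sketch of the live/post-call split: the live path runs a cheap
// per-utterance check, while full QA scoring is queued for after the
// call. The cue list and prompt text are illustrative assumptions.
const FRUSTRATION_CUES = ["frustrated", "ridiculous", "cancel", "speak to a manager"];

// Live path: cheap keyword scan, fast enough to run on every utterance.
function liveSuggestion(utterance: string): string | null {
  const lower = utterance.toLowerCase();
  return FRUSTRATION_CUES.some((cue) => lower.includes(cue))
    ? "Acknowledge the customer's frustration before continuing."
    : null;
}

// Post-call path: heavy analysis is deferred, never run mid-call.
const postCallQueue: string[] = [];
function deferFullScoring(callId: string): void {
  postCallQueue.push(callId); // drained later by the QA report pipeline
}
```

Keeping only the cheap signal on the hot path is what kept live-call latency acceptable.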
4. Version Control & Rapid Iteration
Problem:
Frequent changes to QA prompts and UI components led to merge conflicts and inconsistent behavior across branches.
Solution:
- Established clearer branching and feature isolation
- Separated AI prompt experimentation from core application logic
- Used smaller, more frequent commits to reduce conflicts
5. Explainability & Trust
Problem:
Early QA outputs felt like a black box, which could reduce trust from quality coaches reviewing calls.
Solution:
- Required transcript evidence for every QA flag
- Paired scores with clear explanations and timestamps
- Designed insights to support human reviewers, not replace them
🏆 Accomplishments that we're proud of
- Built an end-to-end AI-powered QA platform that transforms raw customer service audio into structured, actionable insights
- Successfully integrated Soniox for real-time speech-to-text and the OpenAI API for contextual QA analysis and coaching
- Designed a system that automatically identifies “good,” “bad,” “uncertain,” and “needs improvement” moments across entire conversations
- Enabled quality coaches to reduce manual call review time by up to 50–70% by surfacing only high-impact moments
- Implemented real-time agent suggestions that guide tone, empathy, and sales techniques during live calls
- Created a configurable QA rubric system that aligns AI scoring directly with a company’s existing marking scheme
- Built explainable QA reports with timestamps and transcript evidence to increase trust and usability
📚 What we learned
Building QAI taught us that quality assurance is about far more than checking boxes or following scripts. We learned that how something is said (tone, pacing, pauses, and emotional shifts) often has a greater impact on customer experience than the exact words used. Capturing these nuances required us to think beyond simple transcription and focus on how conversations evolve.
From a technical perspective, we learned the importance of structure when working with AI. Models perform best when guided by clear QA rubrics, confidence thresholds, and well-defined categories. Introducing an “uncertain” classification proved especially valuable, as it allowed the system to remain honest about ambiguity rather than forcing it to draw incorrect conclusions. This significantly improved both reliability and trust in the outputs.
We also learned that explainability is critical in quality assurance. Scores and insights are only useful if quality coaches can understand why a call was evaluated the way it was. By tying every QA flag back to transcript evidence and timestamps, we ensured that AI insights supported, rather than replaced, human judgment.
Finally, building QAI reinforced the importance of thoughtful system design. Real-time analysis, performance constraints, and rapid iteration across prompts, logic, and UI required disciplined version control and modular architecture. Overall, the project showed us that AI is most effective when it works alongside humans, reducing manual effort, guiding attention, and enabling better coaching, rather than acting as a black-box decision-maker.
🚀 What's next for QAI
QAI’s next step is to transform quality assurance from a manual, sampled process into a continuous, organization-wide capability. As adoption increases, QAI will support thousands of concurrent live calls, multiple teams, and custom QA rubrics. This will provide supervisors and quality coaches with full visibility into agent performance without increasing their review workload. By enhancing enterprise readiness, QAI will enable organizations to review more calls, identify issues sooner, and deliver faster, more consistent coaching.
Looking ahead, QAI will become more adaptive and proactive by learning from human feedback, identifying long-term behavioral patterns, and surfacing coaching opportunities before issues affect customer experience. The platform will expand beyond voice to support chat, email, and other customer touchpoints, and will integrate directly with existing call center systems. The long-term vision is for QAI to serve as the quality intelligence layer for customer experience, scaling empathy, consistency, and performance without increasing manual effort.
Built With
- next.js
- node.js
- openai
- react
- soniox
- tailwind
- typescript