Inspiration
In today's fast-paced work environment, teams struggle to coordinate tasks while on the go. We envisioned a productivity system where natural voice conversations could seamlessly manage team collaboration, create tasks, schedule meetings, and when needed, connect directly to human support agents.
Traditional productivity tools require context switching between apps, manual data entry, and separate systems for voice support. We asked ourselves: "What if your team's productivity assistant could understand natural speech, manage complex workflows, and intelligently transfer you to a human when needed - all in one seamless experience?"
Inspired by the potential of AI-powered voice interfaces, we set out to build an enterprise-grade system that combines the intelligence of LangGraph AI agents with the reliability of established call center infrastructure.
What it does
Sambanova Voice AI Productivity System is an enterprise voice assistant that enables teams to manage their entire productivity workflow through natural conversation:
Core Capabilities:
🎤 Voice-First Interaction:
- Call a phone number, authenticate with a voice PIN (4-6 digits)
- Natural language conversation with GPT-4 powered AI assistant
- Create todos, reminders, and calendar events by simply speaking
- Barge-in capability: interrupt the AI mid-sentence to course-correct
👥 Team Collaboration:
- Multi-tenant architecture supporting multiple teams
- Role-based access control (Owner, Admin, Member, Viewer)
- Assign tasks to team members by voice: "Assign code review to John in the dev team"
- Search users, manage team membership, create team-specific todos
📞 Intelligent Call Transfer:
- Seamlessly transfer from AI to human agents when needed
- Automatic detection: "I need to speak to a human" triggers transfer
- FreePBX integration with SIP call routing
- Support for multiple departments (support, sales, general)
🌐 WebRTC Call Center:
- Browser-based softphone for agents (JsSIP)
- Real-time SIP registration with FreePBX
- Call controls: answer, hold, transfer, hangup
- Agent dashboard with status management
📊 Production-Grade Monitoring:
- Sentry integration for real-time error tracking
- Performance metrics: agent processing time, tool execution duration
- Automatic error recovery with thread reset
- Custom alerts for timeouts and failures
🔄 Smart Error Recovery:
- Automatic conversation thread reset after timeouts
- Handles MCP connection issues gracefully
- Prevents cascading failures
- User-friendly error messages
How we built it
Technology Stack:
AI & NLP:
- LangGraph for agent orchestration with stateful conversations
- LangChain for LLM tool binding and integration
- OpenAI GPT-4 Turbo for natural language understanding
- Model Context Protocol (MCP) - 38 custom tools for database operations
Voice Integration:
- Twilio Programmable Voice for phone call handling
- Twilio Media Streams for real-time audio (WebSocket)
- OpenAI Whisper for speech-to-text (via Twilio)
- Polly.Amy for text-to-speech responses
Call Center Integration:
- FreePBX on Google Cloud VM for call routing
- SIP/WSS protocol for voice connectivity
- JsSIP v3.10.1 for WebRTC browser softphone
- Direct SIP transfer from Twilio to FreePBX
Backend:
- Flask + Flask-SocketIO for HTTP/WebSocket server
- SQLAlchemy ORM with PostgreSQL (multi-tenant)
- PIN authentication for voice access (4-6 digit)
- Google Calendar OAuth2 integration
Monitoring & Reliability:
- Sentry.io for error tracking and performance monitoring
- Custom timeout optimization (8s/10s/12s layers)
- Thread reset mechanism with timestamped IDs
- BrokenResourceError handling
Deployment:
- Render.com for auto-deployment
- gunicorn + eventlet for async worker management
- Google Cloud Platform for FreePBX VM
- GitHub Actions for CI/CD
Architecture Highlights:
Three-Layer Timeout System:
- Tool execution: 8 seconds
- Agent processing: 10 seconds
- Webhook response: 12 seconds
- Stays under Twilio's 15-second HTTP limit
MCP Tool Integration:
- 36 database tools (todos, calendars, teams, users)
- 2 call transfer tools (transfer_to_agent, get_available_departments)
- Optimized for voice: removed slow Google Calendar sync
- Simplified JSON responses to prevent MCP protocol issues
Error Recovery Flow:
- Detect timeout/error → Mark user for reset
- Next request → Use timestamped thread_id
- Fresh conversation state → No incomplete tool calls
- Prevents OpenAI's "tool_call_id must be followed" errors
Challenges we ran into
1. Twilio Timeout Hell
Problem: Tools took 20-30 seconds, Twilio dropped calls at 15 seconds. Users never heard success messages even though operations completed.
Solution:
- Reduced timeouts at every layer (8s/10s/12s)
- Removed Google Calendar sync from voice flow
- Simplified tool responses
- Result: 95%+ operations complete in time
2. Cascading Tool Call Errors
Problem: When a tool timed out, it left an incomplete tool_call_id in conversation history. Every subsequent request failed with "An assistant message with 'tool_calls' must be followed by tool messages..."
Solution:
- Implemented automatic thread reset detection
- Use timestamped thread IDs after errors
- In-memory tracking of users needing reset
- Self-healing conversation state
3. BrokenResourceError in MCP
Problem: MCP connection broke during tool responses, causing BrokenResourceError with empty error messages that confused the LLM.
Solution:
- Simplified tool return values
- Added specific error handling for BrokenResourceError
- Removed large JSON serialization (Pydantic model_dump_json)
- Return simple strings instead of complex objects
4. Call Transfer Configuration
Problem: Twilio couldn't dial FreePBX directly. Initially tried wrong approaches (routing through trunk number, creating inbound routes that bypass AI).
Solution:
- Understood the correct flow: User → AI → Transfer → FreePBX
- Configured Google Cloud firewall for Twilio IPs
- Direct SIP URI:
<Dial><Sip>sip:2000@34.26.59.14</Sip></Dial> - Added
answer_on_bridge=trueto wait for agent pickup
5. WebRTC Library Issues
Problem: SIP.js library wouldn't load from CDN (SIP is not defined error).
Solution:
- Switched from SIP.js to JsSIP
- Downloaded library locally (283KB)
- Updated all SIP client code to JsSIP API
- Made port configurable (7443 vs 443)
6. SQLAlchemy Reserved Words
Problem: Used metadata as column name in AgentActivity model - SQLAlchemy reserves this for internal use.
Solution:
- Renamed to
extra_data - Updated SQL schema
- Fixed before first deployment
7. Date Defaulting to October 10
Problem: System prompt had hardcoded old date, todos/events created with wrong dates.
Solution:
- Updated system prompt to use current date (2025-10-17)
- Updated all examples in prompt
- Ensures todos are created with correct due dates
Accomplishments that we're proud of
🏆 Production-Grade Error Monitoring
Integrated Sentry with custom instrumentation:
- Real-time tracking of every voice call as a transaction
- Custom measurements for agent processing time
- Automatic error categorization and alerting
- User context tracking (call SID, user ID, transcript)
🎯 Self-Healing Error Recovery
Built an intelligent thread reset system that:
- Detects incomplete tool calls automatically
- Resets conversation state with timestamped thread IDs
- Prevents cascading failures
- Recovers gracefully without user intervention
📞 Seamless AI-to-Human Handoff
Achieved smooth call transfer:
- Voice AI detects transfer intent ("I need to speak to someone")
- Automatically routes to appropriate department
- Integrates with existing FreePBX call center infrastructure
- No dropped calls during handoff
⚡ Sub-Second Response Times
Optimized the entire stack:
- Reduced timeouts by 60% (30s → 12s)
- Removed blocking Google Calendar API calls
- Simplified MCP protocol responses
- 95%+ of operations complete successfully
🎨 Comprehensive Documentation
Created 15+ technical guides:
- Call transfer setup (3 detailed guides)
- Timeout troubleshooting
- WebRTC configuration
- FreePBX Google Cloud setup
- Sentry integration instructions
🛠️ 38 Working MCP Tools
Built a robust tool ecosystem:
- Personal productivity (todos, reminders, calendar)
- Team collaboration (teams, members, assignments)
- Call management (transfer, department routing)
- Database operations (custom SQL queries)
What we learned
1. Timeout Layers Matter
We learned that managing timeouts isn't just about setting one value - you need coordinated timeouts at every layer (tool → agent → webhook) that account for external service limits (Twilio's 15s HTTP timeout).
2. Conversation State is Fragile
LLM conversation threads can break in subtle ways (incomplete tool calls). We learned to build self-healing mechanisms and never assume the state is clean.
3. MCP Protocol Optimization
Large JSON responses can break MCP connections. Simplifying return values and avoiding complex serialization improved reliability dramatically.
4. Call Flow Architecture
Understanding the correct call flow is critical. We learned the difference between:
- Inbound routing (User → FreePBX directly) ❌
- Transfer routing (User → AI → FreePBX) ✅
5. Google Cloud Networking
Running FreePBX on GCP requires TWO firewall layers:
- Google Cloud VPC firewall rules
- FreePBX VM-level firewall Both must allow Twilio IPs for SIP/RTP traffic.
6. Production Observability
Adding Sentry early would have saved hours of debugging. Real-time error tracking and performance metrics are invaluable for diagnosing timeout issues.
7. Library Selection Matters
Not all JavaScript libraries work equally well with CDNs. JsSIP proved more reliable than SIP.js for browser-based softphone implementation.
8. Voice UX is Different
Voice interfaces require:
- Brief, conversational responses (not verbose)
- Immediate feedback ("Creating your event...")
- Graceful degradation ("The task may complete in the background...")
- Barge-in support for natural conversation flow
What's next for Sambanova
Immediate Roadmap (Post-Hackathon):
🔊 Voice Experience Enhancements:
- WebSocket-based streaming for even faster responses
- Custom wake word detection ("Hey Sambanova")
- Multi-language support (Spanish, Mandarin)
- Voice biometric authentication (replace PIN)
📊 Analytics & Insights:
- Redpanda event streaming for real-time analytics
- Team productivity dashboards
- Call volume metrics and heatmaps
- AI vs Human agent routing optimization
🤖 Intelligent Routing:
- ML-based intent classification for faster tool selection
- Sentiment analysis for automatic escalation
- Context-aware department routing
- VIP customer prioritization
🔐 Security Enhancements:
- OAuth2 for team SSO integration
- Role-based field-level permissions
- Audit logging for compliance
- End-to-end encryption for voice calls
Medium-Term Vision:
🌍 Enterprise Features:
- Multi-region deployment for global teams
- Custom AI model fine-tuning per organization
- Slack/Teams/Discord integrations
- Email summary of voice conversations
- CRM integration (Salesforce, HubSpot)
📱 Mobile Apps:
- Native iOS/Android apps
- Push notifications for task assignments
- Offline mode with sync
- Voice memos that auto-create todos
🎯 Advanced AI Capabilities:
- Proactive task suggestions based on patterns
- Smart scheduling with conflict detection
- Natural language search across all team data
- Meeting transcription and action item extraction
Long-Term Goals:
🏢 Platform Evolution:
- Marketplace for custom MCP tools
- White-label solution for enterprises
- Plugin architecture for third-party integrations
- GraphQL API for advanced clients
🧠 AI Innovation:
- Multi-agent workflows (AI agents collaborating)
- Predictive task prioritization using team history
- Automated project timeline generation
- AI-powered meeting facilitation
🌐 Scale & Performance:
- Redis-based session storage (replace in-memory)
- Horizontal scaling with load balancing
- WebSocket cluster for high concurrency
- Edge deployment for low-latency worldwide
📈 Business Intelligence:
- Team productivity analytics
- Bottleneck identification
- Resource allocation optimization
- Performance benchmarking
Why Sambanova?
This project showcases the power of combining:
- 🤖 AI Intelligence (LangGraph + OpenAI)
- 📞 Traditional Telephony (Twilio + FreePBX)
- 👥 Team Collaboration (Multi-tenant + RBAC)
- 🔍 Production Monitoring (Sentry)
- ⚡ Performance Engineering (Timeout optimization)
We built more than a demo - we built a production-ready system with:
- Error monitoring and alerting
- Self-healing error recovery
- Comprehensive documentation (15+ guides)
- Performance optimization based on real constraints
- Enterprise security (PIN auth, JWT, bcrypt)
This represents the future of productivity tools: intelligent, conversational, and seamlessly integrated with human support when AI reaches its limits.## Inspiration
Log in or sign up for Devpost to join the conversation.