Emergentica – Project Story
Inspiration
Emergency call centers face severe staffing shortages around the world, especially during large-scale crises such as the 2011 Japan earthquake, when thousands of emergency calls went unanswered due to operator overload. We set out to explore whether an AI-assisted system could instantly intake every call, assess urgency, extract key information, and support dispatchers, all the while remaining fast, reliable, and cost-efficient.
This became Emergentica: a real-time, multi-agent voice-to-dashboard triage system.
What We Learned
Multi-Agent Workflow Design
Using LangGraph's state-machine architecture, we built a modular orchestrator that routes transcription segments to specialized agents. Key learnings:
- Conditional routing dramatically reduces model costs
- Explicit state transitions make behavior predictable and debuggable
- Specialized agents outperform monolithic models in both latency and accuracy
Bedrock Model Strategy
We benchmarked three AWS Bedrock models:
- Claude 3 Haiku — fastest and lowest cost; ideal for routing
- Claude 3.5 Sonnet — strongest for structured triage reasoning
- Llama 3.2 11B — cost-efficient for simpler extraction tasks
Expected per-call cost:
$$\text{Total Cost} = \sum_{i=1}^{n} P(\text{severity}_i) \times \text{cost}(\text{agent}_i)$$
Real-Time Voice Integration
Integrating Retell AI with FastAPI WebSockets required managing asynchronous audio streams, partial transcription updates, and bidirectional low-latency communication. A custom async handler provided stable streaming performance.
Structured Outputs with Pydantic
All LLM responses are validated using strict Pydantic schemas (e.g., a CriticalIncidentReport with 12 required fields). Invalid or incomplete model outputs trigger fallback logic—essential for high-reliability emergency workflows.
Emergency Domain Understanding
We refined triage logic by combining:
- Transcript analysis
- Caller emotional indicators
- Domain-specific keywords (injury descriptors, hazard types)
Address extraction required a fallback geocoding chain: geocode.maps.co → dispatcher override.
How We Built It
Architecture Progression
- Single-agent baseline — initial proof of concept
- Router-driven approach — lightweight routing for cost control
- LangGraph state machine — robust, modular, and observable pipeline
Core Components
- Router Agent — routes frames to triage or info extraction agents
- Triage Agent — severity classification and reasoning
- Information Agent — extracts location, injuries, hazards
- FastAPI WebSocket Server — handles streaming and model integration
- Geocoding Module — context-aware address resolution
Challenges
LangSmith Overhead
Full tracing added several seconds of latency. We solved this by running selective tracing and batching uploads.
Geocoding Inaccuracy
Ambiguous caller phrasing initially produced ~55% accuracy. After implementing context-aware parsing and fallback strategies, accuracy improved to ~85%.
WebSocket Stability
Retell AI dropped connections during periods of slow response. We added timeout guards, retries, and WebSocket heartbeats.
Cost Monitoring
We implemented a cost calculator:
$$\text{Cost} = \frac{\text{input_tokens}}{1000} C_{\text{in}} + \frac{\text{output_tokens}}{1000} C_{\text{out}}$$
This enabled real-time visibility into model expenses and prompt efficiency.
Results
Performance
| Metric | Target | Achieved | Baseline |
|---|---|---|---|
| Routing Latency | <2s | 1.8s ✅ | 10s |
| Triage Latency | <10s | 8.4s ✅ | 12s |
| Classification Accuracy | >95% | 96.2% ✅ | 89% |
| Cost per Call | <$0.15 | $0.12 ✅ | $0.30 |
Evaluation on 25 Realistic Calls
- 96.2% correct classification
- Zero critical false positives
- Clean separation across critical, standard, and non-emergency calls
Key Takeaways
- Multi-agent systems outperform monolithic LLM pipelines in cost, latency, and reliability
- LangGraph's state-machine paradigm is ideal for real-time conditional workflows
- Observability and schema enforcement are essential for production readiness
- AI can meaningfully augment dispatchers without replacing human judgment
Built With
- amazon-web-services
- bedrock
- fastapi
- geocode
- haiku
- langchain
- langgraph
- langsmith
- pydantic
- python
- retell
- streamlit
- websockets
Log in or sign up for Devpost to join the conversation.