RapidResponse.ai — Project Story
Inspiration
Legacy 911 Computer-Aided Dispatch (CAD) systems are decades old. Many municipal dispatch centers still run software built in the 1990s — green-screen terminals, manual code lookups, dispatchers furiously typing while simultaneously talking to a panicked caller and radioing units. During peak hours a single dispatcher can be managing six or more simultaneous incidents, each demanding full attention.
The cognitive load is brutal, and the consequences of overload are measured in minutes — and lives.
We asked a simple question: what if the first responder to every call wasn't a human at all, but an AI that never gets tired, never mishears a cross-street, and can simultaneously query every protocol in the department's manual in under a second?
AWS Bedrock's Nova Sonic 2 made that question answerable. It is the first generally available, fully bidirectional voice model with native tool use — capable of listening, speaking, and triggering backend actions all within a single streaming session. That combination unlocked something that wasn't possible even a year ago: a fully autonomous AI triage agent that speaks to callers in real time, follows MPDS-style protocols, and hands a classified, annotated incident to a human dispatcher the moment they need to act.
But the real inspiration came from a darker place. In 2019, a woman in Ohio called 911 and ordered a pepperoni pizza. The dispatcher was confused — until he realized she was a domestic violence victim who couldn't speak freely with her abuser in the room. He saved her life by listening between the lines. Most AI systems would have hung up. We built one that doesn't.
What We Built
RapidResponse.ai is a municipal-grade AI-powered 911 emergency dispatch platform. An AI voice agent autonomously handles incoming emergency calls from any browser, triages callers using RAG-backed emergency protocol documents, classifies incidents by type and priority, and surfaces live structured data to human dispatchers and field unit officers on a React dashboard — all in real time.
Core Capabilities
- Live AI voice call — callers speak from any browser; Nova Sonic 2 responds in natural, empathetic speech with barge-in support for natural turn-taking
- RAG-backed protocol retrieval — MPDS-style protocol documents chunked, embedded via Titan Embeddings v2, and stored in LanceDB; top-3 relevant chunks injected into every Nova Sonic turn
- Dynamic incident classification — type (`medical`, `fire`, `police`, `hazmat`, `search_rescue`, `traffic`) and priority P1–P4, determined autonomously by the voice agent mid-call
- Covert distress detection — when a caller orders a "pizza," whispers, or gives only yes/no answers, the AI detects they cannot speak freely, switches to a safe yes/no questioning mode, and flags the dashboard with "SILENT APPROACH — NO SIRENS" instructions
- Role-based dashboard — separate views for dispatchers (god-view, assign any unit) and unit officers (three-tab workflow: Active → Working → Past, can only self-assign and manage their own incidents)
- Bidirectional dispatch Q&A — dispatchers and field officers type questions that get silently injected into the live Sonic session; the AI asks the caller naturally and streams the extracted answer back
- Department-specific dispatch — when units are assigned, the AI tells the caller exactly who is coming: "Police officers are on their way" or "An ambulance with paramedics has been dispatched" — not generic "help is on the way"
- Multi-agent AI ecosystem — four specialized agents collaborating: voice agent, dispatch bridge agent, report agent, and triage agent
- Auto-assignment & escalation — emergency incidents (P1–P2) trigger automatic unit dispatch; the triage agent monitors for escalation signals and suggests battalion increases
- Full audit trail — every transcript turn, dispatch action, and Q&A exchange saved to libSQL; raw audio stored in S3
How We Built It
Runtime & Monorepo
We used Bun as the JavaScript/TypeScript runtime throughout — no Node.js, no ts-node. Bun executes TypeScript natively, ships a built-in HTTP/WebSocket server (Bun.serve()), and starts in under 30ms. The project is a Bun workspace monorepo: backend/ (Bun HTTP + WebSocket server) and frontend/ (React 18 + Vite), orchestrated from the root package.json.
The Voice Agent — Nova Sonic Bidirectional Stream
The core of the system is a persistent bidirectional HTTP/2 stream to AWS Bedrock Nova Sonic 2, opened via InvokeModelWithBidirectionalStreamCommand. This is not a request/response API — it is a long-lived stream that simultaneously carries audio frames inbound and AI audio + transcript + tool calls outbound.
Audio from the browser is captured as raw PCM 16-bit, 16 kHz, mono using a ScriptProcessorNode at 512 samples per frame (~32 ms chunks). Each frame is base64-encoded and sent as a WebSocket message to the backend, which wraps it in the Nova Sonic audioInput event format and forwards it into the Bedrock stream.
The PCM normalization from float32 (Web Audio API) to int16 (Nova Sonic LPCM):
$$s_{\text{int16}} = \operatorname{clamp}\!\left(\operatorname{round}\!\left(s_{\text{float32}} \times 32767\right),\ -32768,\ 32767\right)$$
Nova Sonic responds with 24 kHz PCM audio, which is decoded on the client:
$$s_{\text{float32}} = \frac{s_{\text{int16}}}{32768}$$
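The two conversions above can be written out directly. This is a minimal sketch of the math, not the project's actual encoder; function names are illustrative:

```typescript
// Encode: Web Audio float32 samples in [-1, 1] -> int16 LPCM for Nova Sonic.
function float32ToInt16(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Scale, round to nearest, and clamp to the int16 range.
    const scaled = Math.round(samples[i] * 32767);
    out[i] = Math.max(-32768, Math.min(32767, scaled));
  }
  return out;
}

// Decode: int16 LPCM from Nova Sonic -> float32 for Web Audio playback.
function int16ToFloat32(samples: Int16Array): Float32Array {
  const out = new Float32Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    out[i] = samples[i] / 32768; // back to [-1, 1)
  }
  return out;
}
```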
The agent uses tool use to trigger backend actions mid-conversation:
| Tool | Trigger | Action |
|---|---|---|
| `classify_incident` | Enough information gathered | Update type + priority in libSQL, push SSE, trigger auto-assignment |
| `get_protocol` | Protocol guidance needed | RAG query → inject top-3 chunks into next turn |
| `dispatch_unit` | Unit needed | Create dispatch record, notify dashboard, tell caller who is coming |
| `flag_covert_distress` | Silent distress detected | Set `covert_distress=1`, push SSE with silent approach flag |
One critical implementation detail: tool results must be sent when the stream emits contentEnd with stopReason: "TOOL_USE" — not on the toolUse event itself. Getting this wrong causes the session to hang indefinitely.
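The handshake can be sketched as a small state machine: buffer the `toolUse` event, and only execute and respond once `contentEnd` arrives with `stopReason: "TOOL_USE"`. The event shapes follow AWS's published Nova Sonic samples; `runTool` and `sendToolResult` are placeholders for the project's own plumbing:

```typescript
// Minimal event types for the two events involved in the tool handshake.
type SonicEvent = {
  toolUse?: { toolUseId: string; toolName: string; content: string };
  contentEnd?: { stopReason?: string };
};

function makeToolHandler(
  runTool: (name: string, input: string) => unknown,
  sendToolResult: (toolUseId: string, result: unknown) => void
) {
  let pending: { toolUseId: string; toolName: string; content: string } | null = null;

  return function handle(ev: SonicEvent) {
    if (ev.toolUse) {
      // Do NOT respond yet — just remember the request.
      pending = ev.toolUse;
    } else if (ev.contentEnd?.stopReason === "TOOL_USE" && pending) {
      // Now it is safe to execute the tool and return its result.
      const result = runTool(pending.toolName, pending.content);
      sendToolResult(pending.toolUseId, result);
      pending = null;
    }
  };
}
```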
Multi-Agent Architecture
The system delegates specialized tasks to four distinct agents, each optimized for a different type of reasoning:
1. Nova Agent (novaAgent.ts) — The Frontline Responder
The core voice agent interacting directly with the 911 caller via Nova Sonic 2's bidirectional audio stream. It handles the full human conversation — listening, speaking back in natural empathetic speech, and silently invoking tools (classify_incident, dispatch_unit, get_protocol, flag_covert_distress) when it has gathered enough information. It supports barge-in (caller can interrupt mid-sentence), handles session renewal at 7m30s for calls approaching the 8-minute limit, and adapts its conversational strategy when it detects covert distress.
2. Dispatch Bridge Agent (dispatchBridgeAgent.ts) — The Translator
A Nova Lite-powered agent that bridges the gap between human dispatchers/officers and the live caller conversation. When a dispatcher types a raw question like "ETA on suspect departure?", the bridge agent uses refineQuestion() to rephrase it into natural, empathetic 911 operator language: "Can you tell me — how long ago did the person leave?" It also runs extractAnswer() to scan the live transcript and detect when the caller has answered the injected question, immediately surfacing the structured answer to the dashboard via SSE.
3. Report Agent (reportAgent.ts) — The Synthesizer
A background Nova Lite agent that periodically processes the accumulated call transcript to produce evolving structured incident reports. Every ~30 seconds during an active call, it generates an updated JSON extraction covering caller details, incident classification, hazard assessment, and recommended actions. On case closure, it generates a final 2–3 paragraph prose summary suitable for department records, incorporating dispatch actions taken, units involved, and resolution status.
4. Triage Agent (triageAgent.ts) — The Deterministic Guardian
A pure local-logic agent — no LLM calls, no AI inference. It uses classical rule-matching for speed and safety, constantly scanning incoming transcript and extraction data for high-impact keywords and signal combinations. When it detects signals like "gun," "not breathing," "gas leak," or multiple victims, it immediately triggers escalation suggestions and auto-dispatch recommendations. The escalation logic uses a weighted signal score:
$$P_{\text{score}} = \sum_{i} w_i \cdot x_i, \quad \text{escalate if } P_{\text{score}} \geq \theta$$
where $x_i \in \{0,1\}$ are binary signals (weapon present, unconscious caller, multiple victims, structural fire, chemical hazard) and $w_i$ are protocol-defined weights. The threshold $\theta$ is calibrated per incident type. This agent is deliberately NOT an LLM — in safety-critical escalation decisions, deterministic rules with known behavior are more trustworthy than probabilistic language models.
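A minimal sketch of that weighted-signal rule. The weights and threshold below are invented for illustration — the real values are protocol-defined and calibrated per incident type:

```typescript
type Signal =
  | "weapon"
  | "unconscious"
  | "multiple_victims"
  | "structural_fire"
  | "chemical_hazard";

// Illustrative weights w_i (NOT the project's calibrated values).
const WEIGHTS: Record<Signal, number> = {
  weapon: 3,
  unconscious: 3,
  multiple_victims: 2,
  structural_fire: 2,
  chemical_hazard: 2,
};

// P_score = Σ w_i · x_i ; escalate when P_score >= threshold θ.
function shouldEscalate(active: Set<Signal>, threshold: number): boolean {
  let score = 0;
  for (const s of active) score += WEIGHTS[s];
  return score >= threshold;
}
```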
Covert Distress Detection — The Pizza Pattern
This is our flagship differentiation. When a domestic violence victim calls 911 and says "I'd like to order a large pepperoni pizza," most systems — and most AI systems — would terminate the call. RapidResponse.ai recognizes the pattern.
The Nova Agent's system prompt includes explicit instructions covering ten categories of covert distress: fake food/service orders, silent open lines with background violence sounds, whispering callers, one-word answers with tense tone, sudden "wrong number" backtracks, child callers reporting parental violence, bystander calls from public places, hostage situations, vehicle kidnapping, and coercion signals (caller answers YES to contradictory questions).
When detected, the agent:
- Never breaks cover — does not say "911," "emergency," or "police" aloud
- Pivots to yes/no questioning — "I understand. Are you able to talk freely right now?"
- Fires the `flag_covert_distress` tool — dashboard instantly shows a 🤫 COVERT badge with "SILENT APPROACH — NO SIRENS" dispatch instructions
- Extracts critical info through safe questions — location, weapon presence, children, injury status — all via yes/no format
- Dispatches silently — responding units receive explicit "no sirens, no lights on approach" instructions
RAG — Protocol Retrieval with LanceDB
Emergency protocol documents (PDF, TXT, Markdown) are chunked into 512-token segments with 50-token overlap, embedded via AWS Bedrock Titan Embeddings v2 (1024-dimensional vectors), and stored in LanceDB — an embedded vector database that runs in-process with zero infrastructure.
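The chunking step can be sketched as a sliding window. For simplicity this version splits on whitespace "tokens" rather than the tokenizer Titan actually uses, so the counts are approximate:

```typescript
// Split a document into overlapping chunks: default 512 tokens per chunk
// with a 50-token overlap, so the window advances by 462 tokens each step.
function chunkDocument(text: string, size = 512, overlap = 50): string[] {
  const tokens = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + size).join(" "));
    // Stop once the current window reaches the end of the document.
    if (start + size >= tokens.length) break;
  }
  return chunks;
}
```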
At query time, the top-3 protocol chunks are retrieved by cosine similarity:
$$\text{sim}(\mathbf{q}, \mathbf{d}) = \frac{\mathbf{q} \cdot \mathbf{d}}{\|\mathbf{q}\| \, \|\mathbf{d}\|}$$
where $\mathbf{q}$ is the query embedding and $\mathbf{d}$ is each stored chunk embedding. LanceDB must be configured with distanceType("cosine") consistently at both index creation and query time — using the default "l2" produces incorrect rankings for Titan embeddings.
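The similarity formula, written out directly for reference — in the actual pipeline LanceDB computes this internally once `distanceType("cosine")` is configured:

```typescript
// Cosine similarity: dot product of q and d divided by the product of
// their Euclidean norms. Assumes equal-length, non-zero vectors.
function cosineSimilarity(q: number[], d: number[]): number {
  let dot = 0, nq = 0, nd = 0;
  for (let i = 0; i < q.length; i++) {
    dot += q[i] * d[i];
    nq += q[i] * q[i];
    nd += d[i] * d[i];
  }
  return dot / (Math.sqrt(nq) * Math.sqrt(nd));
}
```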
Geospatial Proximity — S2 Cell Geometry
Unit proximity is indexed using S2 cell tokens — a hierarchical spherical geometry system that tiles the Earth into a quad-tree of cells. Each caller location and unit position is encoded as an S2 cell token at level 13.
The cell area at level $\ell$ is approximately:
$$A_\ell \approx \frac{4\pi}{6 \cdot 4^\ell} \text{ steradians} \approx \frac{510{,}000{,}000}{6 \cdot 4^\ell} \text{ km}^2$$
At $\ell = 13$: $A_{13} \approx 1.27 \text{ km}^2$, giving useful neighborhood-level granularity without expensive PostGIS infrastructure. S2 tokens are stored as Utf8 strings in LanceDB and used as pre-filters on cosine vector search.
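The area formula evaluated in code — a one-liner, using the same ~510 million km² Earth surface area as the text:

```typescript
const EARTH_AREA_KM2 = 510_000_000; // approximate surface area of the Earth

// A_l ≈ 510,000,000 / (6 · 4^l) km² — average S2 cell area at level l.
function s2CellAreaKm2(level: number): number {
  return EARTH_AREA_KM2 / (6 * Math.pow(4, level));
}
```

At level 13 this yields roughly 1.27 km² per cell, matching the figure above.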
Storage Architecture
We enforced strict separation of concerns across three stores:
| Store | Used For |
|---|---|
| libSQL (embedded file) | All structured relational data — incidents, transcripts, units, dispatches, Q&A, sessions, audit logs |
| LanceDB (embedded) | Vector collections — protocols, incident history, geolocations |
| AWS S3 | Binary data — raw audio chunks, final transcript JSON exports |
libSQL runs as an embedded file (file:///data/rapidresponse.db) with zero server setup. Both databases write to a named Docker volume (/data) so data survives container replacement.
Role-Based Dashboard
The dispatcher dashboard supports two distinct roles with different permissions:
Dispatchers have full god-view access — they can see all incidents, all transcripts, assign any unit to any incident, ask questions on any call, and close any case.
Unit Officers log in as a specific unit (e.g., "Patrol P-2") and see a three-tab workflow:
- Active — all incoming incidents; officer can click "I'll Respond" to self-assign
- Working — incidents they're actively responding to, with full detail: live annotated transcript, AI report, Q&A panel, escalation, backup request, and case closure
- Past — their completed incidents with AI-generated reports
Both roles can ask the caller follow-up questions through the bidirectional Q&A system. The dashboard communicates via 18 distinct SSE event types for real-time updates without polling.
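An SSE frame for one of those event types is just a named `event:` line plus a `data:` line terminated by a blank line. A minimal framing sketch (the event name below is an example — the project defines 18 of its own):

```typescript
// Serialize one Server-Sent Event frame: "event:" names the type so the
// dashboard can route it with addEventListener; "data:" carries JSON.
function formatSseEvent(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}
```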
Deployment
The system ships as two Docker containers: a Bun backend (oven/bun:1.2-alpine) and a Caddy frontend (caddy:2-alpine). The frontend Dockerfile uses a two-stage build — Vite inlines VITE_* Firebase keys at build time via Docker ARG. Caddy handles WebSocket upgrade pass-through, and SSE buffering is disabled with flush_interval -1 to ensure each event reaches the dashboard instantly. The backend targets AWS ECS (Fargate) with credentials injected via IAM Task Role — never baked into the image.
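A minimal Caddyfile sketch of the relevant directives — hostnames, ports, and path prefixes here are placeholders, not the project's actual config:

```
example.com {
    # Proxy API and WebSocket traffic to the Bun backend; Caddy handles
    # the WebSocket upgrade automatically.
    reverse_proxy /api/* backend:3000

    # Disable response buffering for the SSE endpoint so each event
    # reaches the dashboard the moment it is written.
    reverse_proxy /events/* backend:3000 {
        flush_interval -1
    }

    # Everything else: the built React app.
    root * /srv
    file_server
}
```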
Challenges
Nova Sonic's HTTP/2-only requirement. The standard AWS SDK HTTP handler does not support HTTP/2 bidirectional streams. We had to explicitly configure NodeHttp2Handler from @smithy/node-http-handler. Without it, the stream silently fails to open.
Getting Nova Sonic to speak first. Nova Sonic does not spontaneously initiate speech. Sending silence only returns usageEvent payloads — the model waits. The solution: inject a USER text content block ("." with interactive: true) immediately after session start, which triggers the "911, what's your emergency?" greeting.
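The kickstart sequence looks roughly like the following. The event shapes follow AWS's published Nova Sonic samples, but the `promptName`/`contentName` values are illustrative placeholders:

```typescript
const promptName = "call-1";      // placeholder session identifiers
const contentName = "kickstart";

// Three events: open a USER text content block, send a single ".", close it.
const kickstartEvents: { event: Record<string, any> }[] = [
  { event: { contentStart: {
      promptName, contentName,
      type: "TEXT", role: "USER", interactive: true,
      textInputConfiguration: { mediaType: "text/plain" },
  } } },
  // A lone "." is enough to make the model take the first turn.
  { event: { textInput: { promptName, contentName, content: "." } } },
  { event: { contentEnd: { promptName, contentName } } },
];
```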
Tool schema double-encoding. The inputSchema.json field in Nova Sonic tool definitions must be a JSON string (i.e., JSON.stringify({...})), not a plain object. Passing an object causes the session to return a cryptic "unexpected error". This is not documented clearly in the AWS SDK — we discovered it by diffing against AWS's own Python reference examples.
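The gotcha in code: `inputSchema.json` must be a *string*, not an object. A sketch of a tool definition using the project's `classify_incident` tool (the schema fields here are illustrative):

```typescript
const classifyIncidentTool = {
  toolSpec: {
    name: "classify_incident",
    description: "Classify the incident type and priority",
    inputSchema: {
      // Correct: a stringified JSON Schema. Passing the object directly
      // causes Nova Sonic to fail with a cryptic "unexpected error".
      json: JSON.stringify({
        type: "object",
        properties: {
          incident_type: { type: "string" },
          priority: { type: "string", enum: ["P1", "P2", "P3", "P4"] },
        },
        required: ["incident_type", "priority"],
      }),
    },
  },
};
```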
LanceDB native addon in Docker. @lancedb/lancedb ships a compiled C++ .node addon. This means bun build --compile (single binary) is not viable — the native addon cannot be bundled. Instead, the Docker image copies both dist/ and node_modules/ to the runner stage and installs libstdc++ via Alpine's package manager.
Browser AudioContext auto-suspend. Browsers suspend AudioContext after a period of inactivity. Attempting to play Nova Sonic's audio response on a suspended context produces silence. The fix: call ctx.resume() before scheduling each buffer source — and handle the promise correctly to avoid a race condition.
SSE connection timeouts. Bun's default idleTimeout is 10 seconds — enough to kill every SSE connection on the dispatcher dashboard within the first idle moment. Setting idleTimeout: 255 (the maximum Bun allows) resolved this.
Covert distress false positives. Tuning the balance between detecting genuine covert calls (where missing the signal could cost a life) and not dispatching units to every accidental dial was one of the hardest design challenges. Our approach: the AI never dismisses a call with any distress signal, but uses contradiction detection (asking two questions that should have opposite answers) to verify coercion before escalating to full emergency dispatch.
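The contradiction check reduces to a simple invariant: ask two questions whose truthful answers must be opposite (e.g. "Is anyone with you right now?" vs. "Are you completely alone?"). A sketch, with the function name and logic as our own simplification:

```typescript
// Truthful answers to inverse questions must differ; if the caller gives
// the same answer to both, their responses may be coerced — escalate to
// full emergency dispatch with covert handling.
function suggestsCoercion(answerA: boolean, answerB: boolean): boolean {
  return answerA === answerB;
}
```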
What We Learned
Building RapidResponse.ai taught us that the gap between "AI demo" and "production-grade AI system" is almost entirely in the infrastructure plumbing — not the model itself. Nova Sonic is remarkably capable out of the box; the hard problems were HTTP/2 stream lifecycle management, binary audio encoding, database schema evolution, and container image design.
The multi-agent architecture was a key lesson. A single monolithic LLM cannot reliably handle real-time voice conversation, background report generation, dispatch question refinement, and safety-critical escalation simultaneously. By splitting these into four agents — a fast voice model for the caller, a lightweight text model for summarization and translation, and a deterministic rule engine for safety — each component does what it does best, and the system is more reliable than any single model could be.
We also learned to respect the data separation principle deeply. The instinct to throw everything into one database is strong. Enforcing strict boundaries — vectors in LanceDB, relations in libSQL, blobs in S3 — kept each system doing what it does best and made the codebase dramatically easier to reason about as it scaled from a proof-of-concept to a platform with 9 migrations, 18 SSE event types, and a full role-based access system.
Most importantly: emergency dispatch is a domain where the cost of failure is measured in human lives. That kept us honest about every architectural decision, every edge case, and every line of error handling. AI can augment dispatchers and reduce cognitive load — but only if the system beneath it is genuinely reliable.
Amazon Nova Models Used
| Model | Role | Why This Model |
|---|---|---|
| Nova Sonic 2 | Real-time voice agent — bidirectional audio stream with tool use | Only model that combines speech-to-speech with native tool invocation in a single streaming session. No other model can listen, speak, and trigger backend actions simultaneously. |
| Nova Lite | Report generation, question refinement, answer extraction | Fast, cost-effective reasoning for background text processing. Runs every ~30s for report updates without burning through credits. |
| Titan Embeddings v2 | Protocol document embedding for RAG | 1024-dimensional vectors optimized for cosine similarity. Powers the protocol retrieval that gives Nova Sonic real-time access to emergency procedures. |
Built With
amazon-nova-sonic · amazon-nova-lite · amazon-titan-embeddings · aws-bedrock · bun · typescript · react · vite · lancedb · libsql · aws-s3 · firebase-auth · arcgis · docker · aws-ecs · server-sent-events · websocket