Inspiration
Cabin crew operate in one of the most demanding work environments - 35,000 feet in the air, responsible for hundreds of passengers, with zero room for error. During turbulence, medical emergencies, or security incidents, they need instant access to safety procedures. But flipping through paper manuals or typing on a tablet while managing a crisis isn't realistic. We asked: what if crew could just talk to their safety manual and it would talk back?
What it does
Cabin Copilot is a real-time voice AI assistant that cabin crew can speak to naturally - no screens, no typing, no delays.
- Ask safety questions by voice → "What's the procedure for an unconscious passenger?" → Cabin Copilot searches the airline's safety manuals and speaks the answer back instantly, grounded in real documentation.
- Log incidents hands-free → "Log a high severity incident on flight AA1234, passenger had an allergic reaction" → The incident is validated, timestamped, and written to the database by voice. Done.
- Live dashboard → Logged incidents appear in real time on a color-coded severity dashboard (🔴 CRITICAL, 🟠 HIGH, 🟡 MEDIUM, 🟢 LOW), giving the crew a shared operational picture.
- Barge-in support → Crew can interrupt the assistant mid-sentence; it stops, listens, and responds to the new input immediately.
How we built it
We used a spec-driven development approach with Kiro, starting from EARS-style requirements → architecture design → implementation tasks.
The core stack:
- Amazon Nova Sonic for real-time speech-to-speech; audio flows in both directions simultaneously over a single WebSocket connection
- Strands Agents BidiAgent for bidirectional streaming agent orchestration with tool use
- Bedrock Knowledge Bases with S3 Vectors (1024-dim cosine index) and Titan Embed Text V2 for RAG over safety manual PDFs
- DynamoDB for incident logging with a GSI on FlightNumber+Timestamp
- FastAPI backend with WebSocket endpoint handling concurrent input/output streams via asyncio.gather
- AWS CDK (Python) for fully automated infrastructure — one cdk deploy provisions everything (DynamoDB, S3, Bedrock KB, S3 Vectors, IAM roles, PDF upload, ingestion trigger)
- Single-file HTML/JS client using Web Audio API for mic capture (PCM 16kHz mono), real-time audio playback with resampling, live transcript bubbles, and a live incident dashboard
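The full-duplex pattern at the heart of the backend can be sketched in plain asyncio. This is a simplified stand-in, not the project's actual code: the queues stand in for the WebSocket and Nova Sonic's bidirectional stream, and an echo coroutine stands in for the model, but the key idea — all directions running concurrently under asyncio.gather rather than as sequential request/response calls — is the same.

```python
# Simplified sketch of the duplex streaming pattern (FastAPI/Nova Sonic
# specifics stripped out; queues stand in for the socket and model stream).
import asyncio

async def pump_input(mic_chunks, model_in: asyncio.Queue):
    # client mic -> model input stream
    for chunk in mic_chunks:
        await model_in.put(chunk)
    await model_in.put(None)  # end-of-stream sentinel

async def pump_output(model_out: asyncio.Queue, speaker: list):
    # model output stream -> client speakers
    while (event := await model_out.get()) is not None:
        speaker.append(event)

async def fake_model(model_in: asyncio.Queue, model_out: asyncio.Queue):
    # stand-in for Nova Sonic: just echoes audio back
    while (chunk := await model_in.get()) is not None:
        await model_out.put(chunk)
    await model_out.put(None)

async def duplex_session(mic_chunks):
    model_in, model_out = asyncio.Queue(), asyncio.Queue()
    speaker = []
    # the key pattern: input, model, and output run concurrently
    await asyncio.gather(
        pump_input(mic_chunks, model_in),
        fake_model(model_in, model_out),
        pump_output(model_out, speaker),
    )
    return speaker
```

In the real system, the two pump coroutines read from and write to the same FastAPI WebSocket, which is why a request/response framing never fit.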
We also wrote property-based tests using Hypothesis to verify correctness properties: incident ID uniqueness, record completeness, severity validation, configuration defaults, and search tool return guarantees.
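An illustrative test in the spirit of that suite, checking record completeness, ID uniqueness, and severity validation. The build_incident_record here is a simplified stand-in for the project's pure function of the same name, not its actual implementation.

```python
# Hedged sketch: a Hypothesis property test over a simplified stand-in
# for the project's pure build_incident_record() function.
import uuid
from datetime import datetime, timezone
from hypothesis import given, strategies as st

SEVERITIES = {"CRITICAL", "HIGH", "MEDIUM", "LOW"}

def build_incident_record(flight: str, severity: str, description: str) -> dict:
    if severity not in SEVERITIES:
        raise ValueError(f"invalid severity: {severity}")
    return {
        "IncidentId": str(uuid.uuid4()),
        "FlightNumber": flight,
        "Severity": severity,
        "Description": description,
        "Timestamp": datetime.now(timezone.utc).isoformat(),
    }

@given(
    flight=st.from_regex(r"[A-Z]{2}\d{1,4}", fullmatch=True),
    severity=st.sampled_from(sorted(SEVERITIES)),
    description=st.text(min_size=1),
)
def test_record_complete_and_unique(flight, severity, description):
    a = build_incident_record(flight, severity, description)
    b = build_incident_record(flight, severity, description)
    # completeness: every required field is present and non-empty
    assert all(a[k] for k in ("IncidentId", "FlightNumber", "Severity", "Timestamp"))
    # uniqueness: identical inputs still yield distinct incident IDs
    assert a["IncidentId"] != b["IncidentId"]
```

Because the function is pure (no DynamoDB write inside), Hypothesis can hammer it with hundreds of randomized inputs without mocks or AWS calls.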
Challenges we ran into
- Bidirectional audio streaming — Getting full-duplex audio working over a single WebSocket (client mic → server → Nova Sonic → server → client speakers) with concurrent input and output was the trickiest part. Standard request/response patterns don't work here; we needed BidiAgent with proper async stream management.
- Audio resampling — Nova Sonic outputs at 16kHz but browser audio contexts run at the device's native sample rate (often 48kHz). We had to implement linear interpolation resampling on the client side to avoid garbled playback.
- Gapless audio playback — Streaming audio chunks arrive independently, so we had to schedule each chunk at precisely the right time using nextPlayTime tracking to avoid clicks and gaps between chunks.
- CDK for Bedrock KB + S3 Vectors — The CDK constructs for Bedrock Knowledge Bases and S3 Vectors are relatively new. Wiring up the vector bucket, index, knowledge base, data source, IAM roles, and ingestion pipeline required careful dependency management.
- Testing voice tools without voice — We separated pure business logic (build_incident_record()) from I/O (DynamoDB writes) to enable property-based testing without mocks or real AWS calls.
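The resampling fix is easy to show in miniature. The project does this in JavaScript on the client; the sketch below expresses the same linear-interpolation idea in Python for clarity, with made-up default rates matching the 16 kHz → 48 kHz case described above.

```python
# Sketch of linear-interpolation resampling (done in JS client-side in
# the actual project): stretch 16 kHz model audio to the device rate.
def resample_linear(samples, src_rate=16_000, dst_rate=48_000):
    """Resample mono PCM floats by interpolating between neighbors."""
    if not samples:
        return []
    ratio = src_rate / dst_rate
    out_len = int(len(samples) / ratio)
    out = []
    for i in range(out_len):
        pos = i * ratio          # fractional position in the source
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        # weighted average of the two nearest source samples
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out
```

Playing 16 kHz chunks directly into a 48 kHz audio context is what produced the garbled playback; interpolating to the context's native rate fixed it.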
Accomplishments that we're proud of
- Fully voice-driven, zero-UI workflow — Crew can search manuals and log incidents entirely by voice. No typing, no tapping, no context switching.
- One-command infrastructure — ./deploy.sh provisions everything from scratch: DynamoDB, S3, Bedrock KB, vector index, IAM roles, uploads safety PDFs, and triggers ingestion. ./teardown.sh removes everything cleanly.
- Property-based testing — 7 Hypothesis tests validate correctness properties across hundreds of randomized inputs, not just hand-picked examples. Every test traces back to a specific requirement.
- Spec-driven development with Kiro — Requirements → design → tasks → implementation, with full traceability. Every line of code maps to a requirement.
- Real-time incident dashboard — Incidents logged by voice appear live on a color-coded dashboard, giving the whole crew situational awareness.
What we learned
- BidiAgent is powerful but different — Bidirectional streaming agents require a fundamentally different architecture than request/response agents. You're managing concurrent async streams, not sequential calls.
- Speech-to-speech changes the UX paradigm — When the interface is voice, you have to think about interruptions (barge-in), latency perception, and how to convey structured data (like incident IDs) audibly.
- S3 Vectors simplifies RAG — Native S3 vector storage eliminates the need for a separate vector database (OpenSearch, Pinecone), reducing both cost and operational complexity.
- Spec-driven development pays off — Writing EARS-style requirements and a design doc before coding felt slow at first, but it eliminated ambiguity and made implementation straightforward. Every task was clear before we wrote a line of code.
What's next for Nova Cabin Copilot
- Multi-language support — Cabin crew on international flights need to communicate in multiple languages. Nova Sonic's multilingual capabilities could enable real-time language switching.
- Crew-to-crew relay — Let crew members in different cabin zones share incident updates by voice through the assistant, creating a shared audio log.
- Flight deck integration — Automatically escalate critical incidents to the flight deck with structured alerts, not just cabin-side logging.
- Offline mode — Cache the most critical safety procedures locally so the assistant works even without connectivity.
- Wearable deployment — Move from a browser client to a lightweight wearable (smart badge or earpiece) so crew can use it truly hands-free while moving through the cabin.
- Post-flight analytics — Aggregate incident data across flights to identify patterns (e.g., recurring turbulence injuries on specific routes) and feed insights back to safety teams.
Built With
- amazon-bedrock
- amazon-nova-sonic
- amazon-titan-embeddings-v2
- amazon-web-services
- aws-cdk
- aws-dynamodb
- aws-s3-vectors
- bedrock-knowledge-bases
- boto3
- fastapi
- javascript
- python
- strands-agents
- uvicorn
- web-audio-api
- websockets