Inspiration
Incident response is the worst time to type. Your hands are already on the keyboard, you're flipping between dashboards and logs and runbooks, and the clock's running. The thing that would actually help, knowing how you fixed this exact problem last time, is usually stuck in someone's head or a Slack thread you can't find. We wanted infra you could just talk to that remembers what happened before.
What it does
OnCall is a voice copilot for on-call engineers. You ask it out loud what's wrong and it answers back in voice. Ask "what's broken in prod?" and instead of just reading you a number, it'll say something like "gateway DB is near its connection limit. Last time this happened you bumped max_connections to 200 and added a PgBouncer pooler. Want the same fix?" So it does the recall, the diagnosis, and a suggested fix without you touching the keyboard.
How we built it
Voice runs on Deepgram's Voice Agent API, which handles STT, turn-taking, and TTS. We fed it our ops vocab (Lambda, Cognito, PgBouncer, max_connections, ARNs) so it stops mangling jargon. Memory is Redis, used three ways beyond a plain key-value cache: agent memory for incident history across sessions, vector search over a corpus of runbooks and postmortems, and semantic caching so repeat questions skip the LLM call. It all runs on Redis Cloud. We built the agent itself with Claude Code. It calls a few ops tools (log query, metric lookup, fix proposal) against a harness we seeded with real incident data, including an actual Postgres connection-exhaustion outage we dealt with. Flow: voice → agent → Redis → voice. Diagram below.
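To make the semantic caching idea concrete, here is a minimal sketch, not our actual code. It assumes a toy bag-of-words `embed` function and an in-memory list standing in for Redis vector search. The names `SemanticCache`, `answer`, and the `0.75` threshold are all hypothetical, picked for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding' (assumption: a real setup would use an embedding model)."""
    return Counter(text.lower().replace("?", "").split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SemanticCache:
    """In-memory stand-in for a vector index; stores (embedding, answer) pairs."""

    def __init__(self, threshold=0.75):
        self.threshold = threshold  # hypothetical cutoff for "same question"
        self.entries = []

    def get(self, question):
        q = embed(question)
        best_score, best_answer = 0.0, None
        for vec, cached_answer in self.entries:
            score = cosine(q, vec)
            if score > best_score:
                best_score, best_answer = score, cached_answer
        # Only reuse an answer when the closest stored question is similar enough.
        return best_answer if best_score >= self.threshold else None

    def put(self, question, answer_text):
        self.entries.append((embed(question), answer_text))


def answer(question, cache, llm):
    """Return a cached answer for a similar question, else call the LLM and cache the result."""
    cached = cache.get(question)
    if cached is not None:
        return cached
    result = llm(question)
    cache.put(question, result)
    return result
```

With this shape, asking "what's broken in prod?" and then "what's broken in prod right now?" only hits the LLM once, while an unrelated question misses the cache and goes through as normal.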
Challenges we ran into
Voice latency was the big one: getting it fast enough that it doesn't feel like a walkie-talkie. STT kept choking on ops terms until we tuned it. And figuring out a memory schema that separates "what's happening right now this session" from "what we've learned over time" took a few tries.
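One way to picture that split between session memory and long-term memory is the rough sketch below. It is an illustration under assumptions, not our real schema: the class names, fields, and the keyword-overlap `recall` are hypothetical, and plain Python objects stand in for Redis.

```python
from dataclasses import dataclass, field


@dataclass
class SessionMemory:
    """Short-lived notes for the incident in progress (hypothetical shape)."""
    session_id: str
    symptom: str = ""
    observations: list = field(default_factory=list)


@dataclass
class IncidentRecord:
    """Durable lesson kept after an incident is resolved."""
    symptom: str
    fix: str


class LongTermMemory:
    """What we've learned over time; outlives any single session."""

    def __init__(self):
        self.records = []

    def promote(self, session, fix):
        # Only the distilled symptom and the fix that worked survive the session;
        # the raw observations are left behind.
        record = IncidentRecord(symptom=session.symptom, fix=fix)
        self.records.append(record)
        return record

    def recall(self, symptom):
        # Naive keyword overlap stands in for vector search here.
        words = set(symptom.lower().split())
        best, best_overlap = None, 0
        for record in self.records:
            overlap = len(words & set(record.symptom.lower().split()))
            if overlap > best_overlap:
                best, best_overlap = record, overlap
        return best
```

The point of the split is that session state can be thrown away when the incident closes, while `promote` decides what is worth keeping for next time.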
Accomplishments that we're proud of
The voice actually matters here. It's not a button we bolted on; you genuinely can't type this fast mid-incident. And the memory recall works. Watching it pull up a specific past outage and suggest the fix that worked got a real reaction from people who tried it.
What we learned
Redis beyond caching is a bigger deal than we expected. Adding memory and vector search turned a stateless bot into something that actually accumulates knowledge. And building for voice forced us to be way more disciplined. A bot that talks out loud can't hide behind a wall of text; it has to actually know the answer.
What's next for OnCall
Three things: wiring up live AWS (CloudWatch, Lambda, RDS) behind the same interface, letting it actually run fixes with a human approving each step, and adding shared team memory so the whole rotation inherits one brain instead of everyone learning the same lessons separately.