Civic Infrastructure Incident Response Agent

The agent first scans today’s logs across all services to detect any critical or high-severity civic incidents.
When no critical incidents are found, the agent treats the absence of data as a signal rather than assuming the system is healthy.
The agent explicitly reports insufficient current data instead of fabricating severity levels.
Based on missing recent logs, the agent recommends a time-boxed investigation rather than immediate escalation.
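
The "no data is a signal" behavior described above can be sketched as a small deterministic check. This is only an illustration, not the agent's actual logic (the agent reaches this conclusion through reasoning); the function name, the 24-hour freshness window, and the 30-minute time box are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical threshold: how old the newest log may be before data counts as stale.
FRESHNESS_WINDOW = timedelta(hours=24)


def triage_when_no_incidents(newest_log_at: Optional[datetime], now: datetime) -> dict:
    """Decide what to report when the scan found no critical or high-severity incidents."""
    if newest_log_at is None or now - newest_log_at > FRESHNESS_WINDOW:
        # No recent logs: silence may mean a broken pipeline, not a healthy system.
        return {
            "severity": "unknown",  # reported as unknown instead of an invented level
            "action": "investigate",
            "time_box_minutes": 30,  # hypothetical time box
            "reason": "insufficient current data",
        }
    # Fresh logs exist and none of them are critical: monitoring is enough.
    return {
        "severity": "low",
        "action": "monitor",
        "time_box_minutes": None,
        "reason": "recent logs present, no critical incidents",
    }
```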

Inspiration

Modern cities depend on critical infrastructure such as water supply, electricity, and sanitation systems. These services generate massive volumes of operational logs every day, and operators are expected to quickly tell which signals represent real incidents and which are harmless noise.

We were inspired by how experienced incident responders think: rather than looking at a single error, they examine patterns over time, service criticality, and recovery signals before deciding what action to take. Our goal was to encode this decision-making process into an AI agent that could assist with, or even automate, first-level incident triage.

What it does

Civic Infrastructure Incident Response Agent is a context-driven, multi-step AI agent that analyzes civic service logs stored in Elasticsearch and turns raw operational data into clear, actionable incident assessments.

The agent:

Retrieves relevant logs based on service, location, and time window
Detects recurring errors and abnormal behavior using ES|QL
Classifies incident severity (Low, Medium, High, Critical)
Recommends the next best action (monitor, investigate, or escalate)
Explains why a decision was made using evidence from the data

Instead of static alerts or one-off answers, the agent provides structured judgment grounded in real operational context.
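
To make "structured judgment" concrete, here is one possible shape for an assessment record. It is a sketch under assumptions: the severity and action values come from the lists above, but the class and field names (`IncidentAssessment`, `evidence`, and so on) are hypothetical and not taken from the project's code.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Severity(Enum):
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"
    CRITICAL = "Critical"


class Action(Enum):
    MONITOR = "monitor"
    INVESTIGATE = "investigate"
    ESCALATE = "escalate"


@dataclass
class IncidentAssessment:
    """One triage result: what was assessed, the verdict, and the supporting evidence."""
    service: str
    location: str
    severity: Severity
    action: Action
    evidence: List[str] = field(default_factory=list)  # short, human-readable facts from the logs

    def summary(self) -> str:
        cited = "; ".join(self.evidence) if self.evidence else "no evidence cited"
        return (f"{self.service} ({self.location}): {self.severity.value}, "
                f"{self.action.value}. Evidence: {cited}")
```

For example, with made-up values, `IncidentAssessment("water", "Ward 4", Severity.HIGH, Action.INVESTIGATE, ["12 pump errors in 1 hour"]).summary()` produces a one-line verdict an operator can read at a glance.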

How we built it

We built the agent using Elasticsearch Agent Builder, combining a reasoning model with native Elasticsearch tools.

The workflow follows four steps:

Context Retrieval using Elasticsearch Search
Pattern Detection using ES|QL for time-aware analysis
Severity Classification through agent reasoning over multiple signals
Action Recommendation with an explicit explanation

All logic runs inside the Elasticsearch ecosystem, making the solution lightweight, fast, and production-friendly. The agent dynamically chooses when to search, when to analyze with ES|QL, and when to reason, instead of relying on a fixed prompt.
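
As an illustration of the pattern-detection step, the sketch below builds an ES|QL query that counts errors per hour for one service and location. The index name `civic-logs` and the fields `service`, `location`, and `level` are assumptions for the example; the real index mapping may differ. The `?` placeholders are meant to be filled from the returned parameter list when the query is sent to the ES|QL query API.

```python
from typing import List, Tuple


def build_error_trend_query(service: str, location: str, hours: int = 24) -> Tuple[str, List[str]]:
    """Return an ES|QL query string and its positional parameters."""
    if not isinstance(hours, int) or hours <= 0:
        raise ValueError("hours must be a positive integer")
    query = (
        "FROM civic-logs\n"  # hypothetical index name
        f"| WHERE @timestamp >= NOW() - {hours} hours\n"
        # Field names below are assumptions about the log schema.
        "| WHERE service == ? AND location == ? AND level == \"ERROR\"\n"
        # Bucket by hour so recurring errors show up as a trend, not a single count.
        "| STATS error_count = COUNT(*) BY bucket = BUCKET(@timestamp, 1 hour)\n"
        "| SORT bucket ASC"
    )
    return query, [service, location]
```

Keeping user-supplied values in the parameter list, rather than formatting them into the query text, avoids quoting problems when a service or location name contains special characters.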

Challenges we ran into

One key challenge was tuning the agent’s reasoning to avoid over-escalation. Early versions were too aggressive, classifying short-lived spikes as critical incidents.
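
One way to express the "short-lived spike is not critical" intuition is a persistence rule over hourly error counts. The agent itself handles this through instructions and reasoning, so the following is only a deterministic sketch; the threshold of 50 errors and the three-bucket requirement are hypothetical values.

```python
from typing import Sequence


def classify_by_persistence(hourly_errors: Sequence[int],
                            spike_threshold: int = 50,
                            sustained_buckets: int = 3) -> str:
    """Map hourly error counts (oldest first) to a severity label."""
    longest = current = 0
    for count in hourly_errors:
        # Track the longest run of consecutive buckets at or above the threshold.
        current = current + 1 if count >= spike_threshold else 0
        longest = max(longest, current)
    still_elevated = bool(hourly_errors) and hourly_errors[-1] >= spike_threshold
    if longest >= sustained_buckets and still_elevated:
        return "Critical"  # sustained and ongoing
    if longest >= sustained_buckets:
        return "High"      # sustained, but the latest bucket shows recovery
    if longest >= 1:
        return "Medium"    # short-lived spike
    return "Low"
```

With these assumed thresholds, a single spiking hour such as `[2, 80, 3, 1]` is rated Medium, while four elevated hours in a row are rated Critical.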

Another challenge was ensuring explainability. We had to carefully ground the agent’s responses in retrieved data so that every recommendation could be justified and trusted by human operators.
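
A simple guard can help enforce this kind of grounding: accept a recommendation only if it cites at least one log document and every cited document was actually retrieved. This is a hypothetical check written for illustration; the function names and the idea of tracking document IDs are assumptions, not a description of how Agent Builder works internally.

```python
from typing import Iterable, List, Set


def unknown_citations(cited_doc_ids: Iterable[str], retrieved_doc_ids: Set[str]) -> List[str]:
    """Return cited document IDs that were not part of the retrieved set."""
    return [doc_id for doc_id in cited_doc_ids if doc_id not in retrieved_doc_ids]


def is_grounded(cited_doc_ids: List[str], retrieved_doc_ids: Set[str]) -> bool:
    """Grounded means: something is cited, and every citation was retrieved."""
    return bool(cited_doc_ids) and not unknown_citations(cited_doc_ids, retrieved_doc_ids)
```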

Accomplishments that we're proud of

Built a true multi-step agent, not just a prompt-based chatbot
Successfully combined Search, ES|QL, and reasoning into one workflow
Delivered consistent, explainable incident decisions
Reduced alert noise while still detecting meaningful infrastructure issues

What we learned

We learned that Agent Builder enables a powerful shift from “answering questions” to doing operational work. Designing agents that reason, choose tools, and explain outcomes requires careful instruction design, but the result is far more impactful than traditional dashboards or alerts.