- The agent explicitly reports insufficient current data instead of fabricating severity levels.
- When no critical incidents are found, the agent treats the absence of data as a signal rather than assuming the system is healthy.
- Based on missing recent logs, the agent recommends a time-boxed investigation rather than immediate escalation.
- The agent first scans today’s logs across all services to detect any critical or high-severity civic incidents.
Inspiration
Modern cities depend on critical infrastructure such as water supply, electricity, and sanitation systems. These services generate massive volumes of operational logs every day, and operators are expected to quickly identify which signals represent real incidents versus harmless noise.
We were inspired by how experienced incident responders think: they don’t just look at a single error, they examine patterns over time, service criticality, and recovery signals before deciding what action to take. Our goal was to encode this decision-making process into an AI agent that could assist — or even automate — first-level incident triage.
What it does
Civic Infrastructure Incident Response Agent is a context-driven, multi-step AI agent that analyzes civic service logs stored in Elasticsearch and turns raw operational data into clear, actionable incident assessments.
The agent:
- Retrieves relevant logs based on service, location, and time window
- Detects recurring errors and abnormal behavior using ES|QL
- Classifies incident severity (Low, Medium, High, Critical)
- Recommends the next best action (monitor, investigate, or escalate)
- Explains why a decision was made using evidence from the data
Instead of static alerts or one-off answers, the agent provides structured judgment grounded in real operational context.
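To make the list above concrete, here is a minimal sketch of how signals could map to a severity and a next action. It is an illustration only: the real agent reaches these decisions through model reasoning over retrieved data, not fixed rules. The `Signals` fields, the `classify` and `recommend` names, and the 0.5 and 0.1 error-rate thresholds are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signals:
    """Hypothetical summary of what the retrieval and ES|QL steps found."""
    recent_log_count: int      # logs seen in the analysis window
    error_count: int           # errors among those logs
    service_is_critical: bool  # e.g. water supply or electricity
    recovered: bool            # a recovery signal followed the errors


def classify(signals: Signals) -> Optional[str]:
    """Return Low/Medium/High/Critical, or None when there is no recent data."""
    if signals.recent_log_count == 0:
        # Report insufficient data instead of inventing a severity.
        return None
    if signals.error_count == 0:
        return "Low"
    error_rate = signals.error_count / signals.recent_log_count
    # Thresholds below are assumed values, not taken from the project.
    if error_rate >= 0.5 and not signals.recovered:
        return "Critical" if signals.service_is_critical else "High"
    if error_rate >= 0.1 and not signals.recovered:
        return "High" if signals.service_is_critical else "Medium"
    if signals.service_is_critical and not signals.recovered:
        return "Medium"
    return "Low"


def recommend(severity: Optional[str]) -> str:
    """Map a severity to monitor, investigate, or escalate."""
    if severity is None:
        # Missing data leads to a time-boxed investigation, not escalation.
        return "investigate"
    actions = {
        "Low": "monitor",
        "Medium": "investigate",
        "High": "escalate",
        "Critical": "escalate",
    }
    return actions[severity]
```

For example, `recommend(classify(Signals(0, 0, True, False)))` yields `"investigate"`, which mirrors how the agent handles a window with no recent logs.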
How we built it
We built the agent using Elasticsearch Agent Builder, combining a reasoning model with native Elasticsearch tools.
The workflow follows four steps:
- Context Retrieval using Elasticsearch Search
- Pattern Detection using ES|QL for time-aware analysis
- Severity Classification through agent reasoning over multiple signals
- Action Recommendation with an explicit explanation
All logic runs inside the Elasticsearch ecosystem, making the solution lightweight, fast, and production-friendly. The agent dynamically chooses when to search, when to analyze with ES|QL, and when to reason, instead of relying on a fixed prompt.
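As a rough illustration of the pattern detection step, the sketch below builds an ES|QL query that counts recurring error messages per service over a recent time window. The index name `civic-logs`, the field names `level`, `service`, and `message`, and the helper `build_recurring_errors_query` are assumptions for this example; the project's actual index mapping and queries are not shown in this write-up.

```python
def build_recurring_errors_query(
    index: str = "civic-logs",  # hypothetical index name
    hours: int = 24,
    min_count: int = 5,
) -> str:
    """Build an ES|QL query listing error messages that recur in the window."""
    if hours <= 0 or min_count <= 0:
        raise ValueError("hours and min_count must be positive")
    return (
        f"FROM {index}\n"
        # Keep only error-level logs inside the time window (assumed fields).
        f'| WHERE @timestamp >= NOW() - {hours} hours AND level == "ERROR"\n'
        # Count how often each message appears per service.
        "| STATS error_count = COUNT(*) BY service, message\n"
        # Drop one-off errors so only recurring patterns remain.
        f"| WHERE error_count >= {min_count}\n"
        "| SORT error_count DESC\n"
        "| LIMIT 20"
    )
```

The resulting string could be sent to Elasticsearch's ES|QL query endpoint or registered as an agent tool. Filtering on `error_count` after `STATS` is one simple way to separate recurring errors from isolated ones.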
Challenges we ran into
One key challenge was tuning the agent’s reasoning to avoid over-escalation. Early versions were too aggressive, classifying short-lived spikes as critical incidents.
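One way to picture the distinction between a short-lived spike and a sustained problem is a check that requires several consecutive time buckets above a threshold. This is a hypothetical sketch, not the tuning we applied to the agent's instructions; the function name `is_sustained` and its default of three consecutive buckets are assumptions.

```python
from typing import Sequence


def is_sustained(
    counts: Sequence[int],
    threshold: int,
    min_consecutive: int = 3,  # assumed default
) -> bool:
    """Return True if error counts stay at or above threshold for enough buckets in a row."""
    run = 0
    for count in counts:
        # Extend the current run, or reset it when a bucket falls below threshold.
        run = run + 1 if count >= threshold else 0
        if run >= min_consecutive:
            return True
    return False
```

With hourly error counts of `[0, 50, 0, 0]` and a threshold of 10, the single spike is not treated as sustained, while `[12, 15, 20, 3]` is.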
Another challenge was ensuring explainability. We had to carefully ground the agent’s responses in retrieved data so that every recommendation could be justified and trusted by human operators.
Accomplishments that we're proud of
- Built a true multi-step agent, not just a prompt-based chatbot
- Successfully combined Search, ES|QL, and reasoning into one workflow
- Delivered consistent, explainable incident decisions
- Reduced alert noise while still detecting meaningful infrastructure issues
What we learned
We learned that Agent Builder enables a powerful shift from “answering questions” to doing operational work. Designing agents that reason, choose tools, and explain outcomes requires careful instruction design — but the result is far more impactful than traditional dashboards or alerts.
What's next for Civic Infrastructure Incident Response Agent
Future improvements include:
- Real-time streaming ingestion from live sensors
- Correlating incidents across multiple services
- Learning from historical incidents to improve severity classification
- Integrating with ticketing or alerting systems to close the loop automatically
Built With
- elasticsearch
- elasticsearch-agent-builder
- es|ql
- kibana
- llms