What it does
Problem
Modern SOC and SRE teams face alert fatigue. Not every alert represents a real incident, yet many monitoring systems treat all alerts equally, leading to noise, slow response, and operational overload.
Solution
The Incident Triage Agent demonstrates a real-time, production-inspired workflow that separates alerts from true incidents. Instead of escalating every signal, the agent classifies operational and security events by severity and intent before they become incidents.
How it Works
Events are continuously generated and ingested into Elasticsearch. Each event is automatically analyzed and classified into severity levels such as critical, error, warning, or info using keyword-based logic that simulates first-level incident triage.
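As a rough illustration of the keyword-based triage described above, the following minimal sketch assigns a severity to a raw event message and wraps it in a document ready for indexing. The keyword lists, the `classify` and `to_document` helpers, and the field names (`severity`, `source`, `@timestamp`) are all assumptions made for this example and are not taken from the project's code.

```python
from datetime import datetime, timezone

# Hypothetical keyword lists, ordered from most to least severe.
# The project's actual keywords are not documented here.
SEVERITY_KEYWORDS = [
    ("critical", ("breach", "ransomware", "outage", "data loss")),
    ("error", ("failed", "exception", "timeout")),
    ("warning", ("retry", "slow", "deprecated")),
]


def classify(message: str) -> str:
    """Return the first severity whose keywords appear in the message."""
    text = message.lower()
    for severity, keywords in SEVERITY_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return severity
    # Anything that matches no keyword is treated as informational.
    return "info"


def to_document(message: str, source: str) -> dict:
    """Build an event document (assumed shape) with its severity attached."""
    return {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "message": message,
        "source": source,
        "severity": classify(message),
    }


if __name__ == "__main__":
    for msg in ("Ransomware detected on host db-01", "User logged in"):
        print(to_document(msg, source="demo"))
```

Because the severities are checked in order, a message that matches both a critical and an error keyword is classified as critical. Substring matching is deliberately simple and will produce false positives on real log data.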
All events are indexed for observability, while dashboards provide two distinct views:
- An operational overview of all severities
- A focused view showing only critical, high-impact events
This mirrors real SOC architectures where data ingestion is decoupled from decision-making.
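The two views can also be expressed at the query level. The sketch below builds Elasticsearch query DSL bodies for each: a count of events per severity for the overview, and a filter on critical events for the focused view. It assumes a `severity` field mapped as a keyword and an `@timestamp` field; the function names are hypothetical, and the project's dashboards may be configured differently.

```python
def overview_query() -> dict:
    """Count events per severity without returning individual hits."""
    return {
        "size": 0,
        "aggs": {
            "by_severity": {"terms": {"field": "severity"}},
        },
    }


def critical_query(max_hits: int = 50) -> dict:
    """Return only critical events, newest first."""
    return {
        "size": max_hits,
        "query": {
            "bool": {
                # A filter clause matches exactly and skips relevance scoring.
                "filter": [{"term": {"severity": "critical"}}],
            }
        },
        "sort": [{"@timestamp": {"order": "desc"}}],
    }


if __name__ == "__main__":
    print(overview_query())
    print(critical_query(max_hits=10))
```

Both views read from the same indexed events, so ingestion stays unchanged when the criteria for the focused view change. In a Kibana dashboard, the equivalent of the second query would be a filter such as `severity : "critical"`.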
Why It Matters
By separating alerts from incidents, the system reduces noise, improves response focus, and reflects how mature SOC and SRE teams operate in production environments.
Challenges & Learnings
Designing severity classification that balances simplicity and realism was the main challenge. The project reinforced the importance of decision-centric observability over raw alert volume.
Built With
- elastic-cloud
- elastic-lens
- elasticsearch
- observability
- python
