Inspiration
In our real-world experience working in shared AWS environments, we observed a common pattern. A single AWS account is often shared across multiple teams. Developers spin up EC2 instances for testing, create temporary databases, launch proof-of-concept environments — and sometimes forget to shut them down. Many resources are created without proper tagging or ownership metadata.
Over time, these resources continue running unnoticed.
Only during monthly audits does the account manager or cloud operations team discover unexpected billing spikes. They manually investigate, identify idle resources, and stop or terminate them. This reactive cleanup cycle repeats month after month.
Most cost tools stop at visibility. They show dashboards. They generate alerts. But remediation remains manual, delayed, and dependent on human intervention.
We were inspired by a simple idea:
What if Elastic Agent Builder could move beyond insight and autonomously govern cloud cost?
Instead of just explaining why the bill increased, we wanted an agent that could:
- Detect idle infrastructure
- Quantify financial impact
- Enforce governance guardrails
- Safely take action
That vision led to Opra: Cloud Remediation Co-Pilot — an autonomous agent designed to continuously detect, classify, and remediate cloud waste before it becomes a billing surprise.
What it does
Opra is a multi-step AI agent built with Elastic Agent Builder that:
- Detects idle AWS resources using ES|QL tools
- Classifies resources using multi-signal inactivity rules
- Calculates projected monthly and annual cost savings
- Generates a structured remediation plan
- Executes safe remediation via workflows and runners
- Verifies state changes before reporting savings
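The savings projection in the list above can be sketched roughly as follows. This is a minimal illustration and not Opra's actual cost tool: the `IdleResource` type, the `hourly_cost_usd` field, and the 730-hours-per-month constant are assumptions made for the example, and it ignores charges that continue after a stop, such as attached storage.

```python
from dataclasses import dataclass

# Assumption: average number of hours in a month (8760 / 12).
HOURS_PER_MONTH = 730


@dataclass(frozen=True)
class IdleResource:
    """Hypothetical record produced by a detection plus cost-enrichment step."""
    resource_id: str
    hourly_cost_usd: float


def projected_savings(resources):
    """Return (monthly, annual) projected savings in USD, rounded to cents."""
    monthly = sum(r.hourly_cost_usd * HOURS_PER_MONTH for r in resources)
    return round(monthly, 2), round(monthly * 12, 2)
```

For example, two idle instances priced at 0.10 and 0.05 USD per hour would project to about 109.50 USD per month under these assumptions.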
Opra enforces strict governance controls:
- Bulk termination is not allowed
- Only HIGH-confidence resources are eligible for safe bulk stop
- Explicit instance IDs are required for termination
- Tool inputs are validated against detected resources
- Post-remediation verification confirms actual savings
This transforms cost visibility into autonomous cloud governance.
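As an illustration of how the governance controls listed above might be enforced before any tool call reaches AWS, here is a minimal sketch. The function name `validate_request`, its parameters, and the shape of the `detected` mapping are hypothetical. The sketch also assumes that a stop request naming explicit instance IDs may include MEDIUM-confidence resources, since the rules above only restrict bulk stops.

```python
def validate_request(action, detected, instance_ids=None):
    """Return the instance IDs approved for `action`, or raise ValueError.

    detected: dict mapping instance ID -> confidence ("HIGH" or "MEDIUM"),
              as produced by a prior detection run (hypothetical shape).
    instance_ids: explicit IDs from the user, or None for a bulk request.
    """
    if action not in ("stop", "terminate"):
        raise ValueError(f"unsupported action: {action!r}")

    if not instance_ids:
        # Bulk request: termination is never allowed in bulk.
        if action == "terminate":
            raise ValueError("termination requires explicit instance IDs")
        # Bulk stop is limited to HIGH-confidence resources.
        return sorted(iid for iid, conf in detected.items() if conf == "HIGH")

    # Explicit request: every ID must appear in the detection results.
    unknown = sorted(set(instance_ids) - set(detected))
    if unknown:
        raise ValueError(f"not in detection results: {unknown}")
    return list(instance_ids)
```

Rejecting unknown IDs outright, rather than silently dropping them, keeps a mistyped or hallucinated ID from turning into a partial action that nobody asked for.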
How we built it
1) Data Layer
AWS metrics are ingested into Elasticsearch using Elastic integrations (EC2, EBS, RDS, Lambda).
2) Detection
We built ES|QL-powered tools that analyse sustained inactivity over a rolling 14-day window.
Resources are considered idle only when multiple independent inactivity signals confirm it:
- Low CPU utilisation
- Low network IO
- Low storage operations
- Near-zero connections (for databases)
This multi-signal approach reduces false positives.
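The multi-signal rule can be sketched in plain Python as below. In Opra the aggregation is done by ES|QL tools; this sketch only illustrates the decision logic once daily values are available. The signal names and threshold values are hypothetical, and the choice to require every day in the window to stay under every threshold is one possible reading of "sustained inactivity".

```python
WINDOW_DAYS = 14  # rolling window described above

# Hypothetical per-day thresholds; real values would need tuning per resource type.
THRESHOLDS = {"cpu_pct": 5.0, "network_bytes": 1_000_000, "disk_ops": 100}


def is_idle(daily_metrics, thresholds=THRESHOLDS, window_days=WINDOW_DAYS):
    """Return True only if every signal stays below its threshold on every day.

    daily_metrics: list of dicts (one per day, oldest first), each holding a
                   value for every key in `thresholds`.
    """
    window = daily_metrics[-window_days:]
    if len(window) < window_days:
        # Not enough history to claim sustained inactivity.
        return False
    return all(
        day[signal] < limit
        for day in window
        for signal, limit in thresholds.items()
    )
```

A resource with quiet CPU but one day of heavy network traffic is not flagged, which is the false-positive case the multi-signal approach is meant to avoid.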
3) Classification
Resources are categorised into:
- HIGH confidence — sustained near-zero activity
- MEDIUM confidence — low but non-zero activity
This tiering enables safe bulk remediation while requiring review for borderline cases.
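A minimal sketch of the tiering, assuming a resource has already been flagged as idle and that its peak value per signal over the window is known. The `NEAR_ZERO` limits and the `window_max` input shape are illustrative assumptions, not Opra's actual rules.

```python
from enum import Enum


class Confidence(Enum):
    HIGH = "HIGH"      # sustained near-zero activity
    MEDIUM = "MEDIUM"  # low but non-zero activity


# Hypothetical near-zero limits, stricter than the idle thresholds.
NEAR_ZERO = {"cpu_pct": 1.0, "network_bytes": 100_000, "disk_ops": 10}


def classify(window_max):
    """Classify an already-idle resource from its peak value per signal."""
    near_zero = all(window_max[s] <= limit for s, limit in NEAR_ZERO.items())
    return Confidence.HIGH if near_zero else Confidence.MEDIUM


def needs_review(confidence):
    """MEDIUM resources go to a human; HIGH ones are eligible for bulk stop."""
    return confidence is Confidence.MEDIUM
```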
4) Orchestration
Elastic Agent Builder invokes detection tools, cost enrichment tools, and remediation tools in sequence.
Elastic Workflows coordinate execution, and external runners interact securely with AWS APIs.
All execution results are written back into Elasticsearch, ensuring traceability, auditability, and post-remediation verification.
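The verification step can be sketched as a filter that counts savings only for resources whose state actually changed. The function name, the `planned` mapping, and the injected `get_state` callable are hypothetical; in a real runner `get_state` could wrap boto3's `describe_instances` and read the instance's `State.Name`.

```python
# EC2 instance states that indicate the remediation took effect.
REMEDIATED_STATES = ("stopped", "terminated")


def verified_savings(planned, get_state):
    """Return (per-instance verified savings, total) for remediated instances.

    planned: dict mapping instance ID -> projected monthly saving in USD.
    get_state: callable taking an instance ID and returning its current
               state name, e.g. "running" or "stopped".
    """
    verified = {
        iid: saving
        for iid, saving in planned.items()
        if get_state(iid) in REMEDIATED_STATES
    }
    return verified, sum(verified.values())
```

Passing the state lookup in as a callable keeps the check testable without AWS credentials: a plain dictionary's `get` method can stand in for the live API.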
Challenges we ran into
- Preventing the agent from invoking tools automatically, without explicit user approval
- Designing safe bulk remediation logic
- Enforcing strict guardrails to avoid accidental production impact
- Implementing post-remediation verification before reporting savings
- Structuring deterministic multi-step reasoning within Agent Builder
Accomplishments that we're proud of
- Built a multi-step, autonomous FinOps agent
- Implemented strong governance guardrails to prevent unsafe execution
- Designed confidence-based remediation logic
- Enabled measurable savings calculation before and after execution
- Created a closed-loop system: Detect → Classify → Remediate → Verify
Opra doesn’t just suggest optimisations — it executes them safely.
What we learned
- Automation without guardrails is dangerous
- Multi-signal classification significantly reduces false positives
- Tool orchestration must be deterministic
- Verification is as important as execution
- Elastic Agent Builder becomes extremely powerful when combined with workflows and runners
- Elastic Observability provides the visibility layer that makes autonomous remediation safe, measurable, and continuously verifiable
What's next for Opra: Cloud Remediation Co-Pilot
- Expand beyond AWS to multi-cloud governance
- Add anomaly detection for unexpected cost spikes
- Extend governance to tagging compliance and security misconfigurations
- Implement scheduled autonomous scans
- Introduce risk scoring and environment-aware prioritisation
Our long-term vision is to make Opra an autonomous cloud governance layer — not just for cost optimisation, but for compliance, operational efficiency, and intelligent infrastructure management.
Built With
- agent
- amazon-web-services
- apis
- boto3
- elasticsearch
- es|ql
- kibana
- observability
- python
- runner
- workflows