Inspiration
Mean Time To Remediate (MTTR) is the silent killer of engineering velocity. We noticed SREs spend hours debugging CI/CD failures that are often repetitive: disk exhaustion, transient network glitches, or version mismatches. Most existing AI tools are chatbots that explain errors. We wanted an agent that deterministically routes each failure to a remediation.
What it does
BuildMedic is a semantic failure router, not a chatbot. It ingests CI/CD failure logs, semantically matches them against a policy store of runbooks using Elasticsearch, and outputs structured JSON commands (retry, scale, rollback). It gates actions on confidence scores, so only high-certainty remediations are recommended. Every decision is written to an audit index to support future feedback loops.
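As a rough illustration of the confidence gate, here is a minimal Python sketch. It is not the project's actual code: the function name `route_failure`, the field names in the returned dictionary, and the example runbook IDs are all hypothetical, and the 0.75 cutoff is assumed from the low-confidence figure reported later in this write-up.

```python
import json

# Assumed cutoff: the write-up reports that matches below 0.75 go to human review.
CONFIDENCE_THRESHOLD = 0.75
# Action names taken from the description above.
ALLOWED_ACTIONS = {"retry", "scale", "rollback"}


def route_failure(runbook_id, action, confidence, threshold=CONFIDENCE_THRESHOLD):
    """Return a structured decision; the action is None when the gate rejects it."""
    # Hypothetical schema: only known actions above the threshold pass the gate.
    gated = action in ALLOWED_ACTIONS and confidence >= threshold
    return {
        "runbook_id": runbook_id,
        "action": action if gated else None,
        "confidence": confidence,
        "needs_human_review": not gated,
    }


if __name__ == "__main__":
    # A confident match keeps its action; a weak match is emitted with a null action.
    print(json.dumps(route_failure("rb-disk-full", "retry", 0.91)))
    print(json.dumps(route_failure("rb-unknown", "rollback", 0.62)))
```

The sketch only produces a recommendation. Nothing here executes a remediation, which matches the router-not-executor design described above.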
How we built it
- Core: Elasticsearch Agent Builder with semantic_text fields powered by ELSER-2.
- Architecture: Three-index pattern: buildmedic-logs (ingestion), buildmedic-runbooks (policy brain), buildmedic-history (audit trail).
- Agent Logic: Custom system prompt enforcing strict JSON output. No natural language chatter.
- Retrieval: Hybrid semantic search matching error messages to symptom descriptions with confidence thresholding.
Challenges we ran into
- Hallucination Control: Forcing the agent to output only valid JSON without markdown wrapping required iterative prompt engineering.
- Statelessness: Elastic Agents are stateless. We fake memory by logging every decision to buildmedic-history, enabling auditability and future feedback loops.
- Semantic Precision: We had to balance semantic similarity with keyword filters to avoid false positives on critical infrastructure actions.
Accomplishments that we're proud of
- Deterministic Classification: Achieved >0.85 confidence on known failure patterns (infra, transient, regression).
- Safety First: The agent never executes directly; it outputs recommendations. Low-confidence matches (<0.75) get a null action and are left for human review.
- Zero-Code Policy Updates: Adding a new failure pattern requires only inserting a new runbook document, with no code deployment.
What we learned
- Agents as Routers: LLMs are most reliable in production when used as classifiers/routers, not direct executors.
- Semantic Text Simplicity: ES semantic_text abstracts away vector pipeline complexity, making RAG trivial to implement.
- Audit Trails are Mandatory: You cannot trust AI decisions without an immutable log of why a decision was made.
What's next for BuildMedic
- Adaptive Learning: Automatically adjust confidence_threshold based on remediation success rates logged in buildmedic-history.
- Real Integrations: Connect webhook tools to GitHub Actions/GitLab APIs for actual auto-remediation.
- MTTR Dashboard: Kibana Lens visualization tracking time-to-fix before vs. after BuildMedic deployment.
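The Adaptive Learning idea could be sketched roughly as follows. This is a speculative illustration of a feature that is not built yet: the function name `adjust_threshold`, the target success rate, the step size, and the floor and ceiling values are all assumptions, and the outcomes list stands in for whatever success flags would be read from buildmedic-history.

```python
def adjust_threshold(current, outcomes, target_success=0.9, step=0.02,
                     floor=0.75, ceiling=0.95):
    """Nudge confidence_threshold based on recent remediation outcomes.

    outcomes is a list of booleans (True = the remediation worked).
    All default values are hypothetical tuning knobs.
    """
    if not outcomes:
        # No evidence yet, so leave the threshold unchanged.
        return current
    success_rate = sum(1 for ok in outcomes if ok) / len(outcomes)
    # Too many failed remediations: be stricter. Otherwise relax slightly.
    proposed = current + step if success_rate < target_success else current - step
    # Clamp so the gate never drifts below the floor or above the ceiling.
    return round(min(ceiling, max(floor, proposed)), 4)


if __name__ == "__main__":
    print(adjust_threshold(0.80, [True, False, False, True]))
    print(adjust_threshold(0.80, [True] * 10))
```

Keeping a floor means the loop can never relax the gate below the point where matches were originally sent to human review.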
Built With
- agent
- elastic
- elasticsearch
- elser-2
- esearch
- esql
- github
- json
- kibana
- webhooks

