Inspiration

Support and operations teams face two recurring problems:

  1. Incidents happen fast, but triage is slow.
  2. AI can draft responses, but often without verifiable evidence.

Large language models are powerful, but hallucinations are unacceptable in production environments. I wanted to explore a simple idea:

What if an AI system refused to act unless it had independent evidence?

Elastic’s Agent Builder and Elasticsearch provided the foundation for an evidence-gated AI support copilot: one that combines real-time detection, hybrid search, and strict citation validation before any automated action.

What it does

ElasticOps Copilot is an AI-powered support automation platform built on Elasticsearch.

It provides:

  • Real-time incident detection using ES|QL spike queries
  • Intelligent ticket triage using semantic + lexical retrieval
  • Duplicate prevention via kNN vector similarity
  • Hybrid search using BM25 + vector search + Reciprocal Rank Fusion (RRF)
  • Evidence-gated automation (requires ≥2 independent citations before writing)
  • Full audit timeline for transparency and observability
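
As a rough illustration of the duplicate-prevention item, the sketch below compares a new ticket's embedding with existing ones by cosine similarity and returns the closest match above a cutoff. In the project itself the nearest-neighbour lookup runs inside Elasticsearch; the `Ticket` shape, the `findDuplicate` name, and the 0.9 cutoff are assumptions made for this example rather than values taken from the project.

```typescript
// Hypothetical ticket shape: an id plus a precomputed embedding vector.
interface Ticket {
  id: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the most similar existing ticket if it meets the cutoff, else null.
function findDuplicate(
  incoming: Ticket,
  existing: Ticket[],
  cutoff: number = 0.9 // assumed threshold; tune against real data
): { id: string; similarity: number } | null {
  let best: { id: string; similarity: number } | null = null;
  for (const candidate of existing) {
    const similarity = cosineSimilarity(incoming.embedding, candidate.embedding);
    if (similarity >= cutoff && (best === null || similarity > best.similarity)) {
      best = { id: candidate.id, similarity };
    }
  }
  return best;
}
```

A `null` result means no existing ticket is close enough, so the incoming ticket can proceed to triage as a new item.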

Unlike typical AI tools, ElasticOps Copilot refuses to update or create a ticket unless it retrieves at least two independent citations from different indices (e.g., knowledge base + prior tickets or resolutions).

Because every write must cite retrieved documents, ungrounded drafts are blocked before they reach a ticket, which makes the automation easier to trust.

How we built it

ElasticOps Copilot is built on:

  • Elasticsearch Cloud
  • Elastic Agent Builder
  • ES|QL for spike detection
  • BM25 full-text ranking
  • kNN vector similarity search
  • Reciprocal Rank Fusion (RRF) to combine lexical + semantic rankings
  • Next.js + Vercel frontend
  • Custom MCP server for tool integration

Real-time log detection

We use ES|QL queries such as:

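FROM logs-app
// Keep only ERROR-level entries from the last five minutes
| WHERE @timestamp >= NOW() - 5 minutes
| WHERE level == "ERROR"
// Count errors per service and environment, then keep only the spikes
| STATS errors = COUNT(*) BY service, env
| WHERE errors >= 40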

When a service and environment pair logs 40 or more errors within the five-minute window, an incident is created automatically.
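
To show how the query result might become incidents, here is a minimal sketch that maps the rows of an ES|QL JSON response (a `columns` array plus row-oriented `values`) to incident drafts. The `IncidentDraft` shape, the `spikesToIncidents` name, and the title wording are hypothetical; the project's actual incident schema is not shown in this writeup.

```typescript
// Shape of an ES|QL JSON response: column metadata plus row-oriented values.
interface EsqlResponse {
  columns: { name: string; type: string }[];
  values: unknown[][];
}

// Hypothetical incident record; the field names are illustrative.
interface IncidentDraft {
  service: string;
  env: string;
  errorCount: number;
  title: string;
}

// Maps each spike row to an incident draft, looking columns up by name
// so the code does not depend on column order.
function spikesToIncidents(response: EsqlResponse): IncidentDraft[] {
  const indexOf = (name: string): number => {
    const i = response.columns.findIndex((c) => c.name === name);
    if (i === -1) throw new Error(`missing column: ${name}`);
    return i;
  };
  const errorsIdx = indexOf("errors");
  const serviceIdx = indexOf("service");
  const envIdx = indexOf("env");

  return response.values.map((row) => {
    const service = String(row[serviceIdx]);
    const env = String(row[envIdx]);
    const errorCount = Number(row[errorsIdx]);
    return {
      service,
      env,
      errorCount,
      title: `Error spike: ${service} (${env}), ${errorCount} errors in 5 minutes`,
    };
  });
}
```

Because the query already filters on `errors >= 40`, every returned row is a spike and the function needs no threshold of its own.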

Hybrid retrieval architecture

We combine:

  • BM25 (lexical match)
  • kNN vector similarity
  • RRF rank fusion

This allows precise keyword matches and semantic understanding to work together.
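
For readers unfamiliar with RRF, the sketch below shows the fusion step on its own: each document earns `1 / (k + rank)` from every ranked list it appears in, and the sums decide the final order. In the project the fusion is performed by Elasticsearch; this standalone function, its name, and the sample ids are illustrative only, and `k = 60` is used here as a commonly cited default.

```typescript
// Fuses ranked lists of document ids with Reciprocal Rank Fusion.
// Each document scores the sum of 1 / (k + rank) over the lists it
// appears in, where rank starts at 1. Ids are assumed unique per list.
function reciprocalRankFusion(
  rankings: string[][],
  k: number = 60
): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Example: one lexical ranking and one vector ranking (made-up ids).
const bm25Hits = ["kb-12", "kb-7", "kb-3"];
const knnHits = ["kb-7", "kb-9", "kb-12"];
const fusedHits = reciprocalRankFusion([bm25Hits, knnHits]);
```

Here `kb-7` and `kb-12` rise to the top because both retrievers returned them, while documents found by only one retriever rank lower. Since only ranks are used, BM25 scores and vector similarities never need to be normalised against each other.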

Evidence gating

Before ticket creation or update:

  • Retrieve KB articles
  • Retrieve resolutions
  • Retrieve similar tickets
  • Validate ≥2 independent citations
  • Only then allow action
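
The gate itself can be sketched in a few lines, assuming each citation records the index it came from. The `Citation` and `GateDecision` shapes, the `evidenceGate` name, and the index names in the comments are hypothetical and stand in for whatever the project uses internally.

```typescript
// Hypothetical citation shape: which index a supporting document came from.
interface Citation {
  index: string; // e.g. "kb-articles", "resolutions", "tickets" (illustrative names)
  docId: string;
}

interface GateDecision {
  allowed: boolean;
  distinctIndices: string[];
  reason: string;
}

// Allows a write only when the citations span at least `minSources`
// different indices; several hits from one index count as one source.
function evidenceGate(citations: Citation[], minSources: number = 2): GateDecision {
  const distinctIndices = Array.from(new Set(citations.map((c) => c.index))).sort();
  const allowed = distinctIndices.length >= minSources;
  return {
    allowed,
    distinctIndices,
    reason: allowed
      ? `evidence found in ${distinctIndices.length} indices`
      : `need ${minSources} independent sources, found ${distinctIndices.length}`,
  };
}
```

Counting distinct indices rather than distinct documents is what keeps two articles from the same knowledge base from passing as independent evidence.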

All actions are recorded in an audit timeline, including:

  • Embedding step
  • Classification
  • Dedupe results
  • Retrieval results
  • Draft
  • Final action
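
One possible way to model such a timeline is an append-only list of typed events per run. The sketch below keeps events in memory purely for illustration; the `AuditTimeline` class, the step identifiers, and the event fields are assumptions, and how the project actually persists its timeline is not described here.

```typescript
// Step names mirror the timeline entries listed above.
type AuditStep =
  | "embedding"
  | "classification"
  | "dedupe"
  | "retrieval"
  | "draft"
  | "final_action";

interface AuditEvent {
  runId: string;
  step: AuditStep;
  at: string; // ISO 8601 timestamp
  detail: Record<string, unknown>;
}

// Append-only, in-memory timeline for a single run (illustrative only).
class AuditTimeline {
  private readonly runId: string;
  private readonly events: AuditEvent[] = [];

  constructor(runId: string) {
    this.runId = runId;
  }

  // Records one step with optional structured detail and returns the event.
  record(step: AuditStep, detail: Record<string, unknown> = {}): AuditEvent {
    const event: AuditEvent = {
      runId: this.runId,
      step,
      at: new Date().toISOString(),
      detail,
    };
    this.events.push(event);
    return event;
  }

  // Returns a copy so callers cannot alter the recorded history.
  list(): AuditEvent[] {
    return this.events.slice();
  }
}
```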

Challenges we ran into

  • Implementing the MCP server correctly for Agent Builder integration
  • Ensuring citations were truly cross-index (not same-source)
  • Balancing RRF weights between BM25 and vector similarity
  • Designing UI transparency (confidence breakdown, “Why ranked here?” panels)
  • Managing token efficiency and retrieval limits

The biggest challenge was ensuring the system did not merely appear intelligent, but could show the evidence behind every action it took.

Accomplishments that we're proud of

  • Fully working ES|QL spike detection pipeline
  • Hybrid search with visible RRF transparency
  • Confidence breakdown visualization (KB / Resolutions / Similarity)
  • Evidence-gated automation logic
  • Complete audit trail with run timelines
  • MCP server integration for Elastic Agent Builder
  • Production-style UI with observability and metrics dashboard

Most importantly:

The system refuses to act without evidence.

What we learned

  • Hybrid search (BM25 and kNN fused with RRF) is stronger than either retriever alone.
  • ES|QL enables elegant real-time operational logic.
  • Evidence gating makes AI actions easier to trust, because each one can be traced to its citations.
  • Transparency (score breakdowns, timelines) is as important as accuracy.
  • Agent Builder enables structured automation workflows beyond simple chat.

We also learned how critical explainability is in production AI systems.

What's next for ElasticOps Copilot – Evidence-Gated AI Support

Next steps include:

  • SLA-aware routing
  • Incident root-cause clustering
  • Adaptive threshold detection
  • Multi-tenant support environments
  • Confidence-aware auto-escalation
  • Production observability dashboards
  • Self-healing playbooks triggered by ES|QL

The long-term vision is:

AI automation that is fast, safe, and provably grounded in search evidence.

Built With

  • bm25
  • elastic-agent-builder
  • elasticsearch-cloud
  • es|ql
  • knn-vector-search
  • mcp
  • nextjs
  • node.js
  • rrf
  • vercel