Research Orchestration System

Research Orchestration System — Multi-Agent Literature Review and Claim Verification

Synthesizing findings across dozens of papers into a literature review takes researchers weeks, and cross-checking whether a specific claim holds up across the broader literature is equally time-consuming. The Research Orchestration System automates both tasks with three multi-step agents built on Elastic Agent Builder. The agents are specialized: they search, synthesize, review, and verify each other's work. The system currently runs on ~200 Agentic AI papers (~5,000 full-text chunks) indexed in Elasticsearch, but it is corpus-agnostic: the included indexing pipeline converts any collection of PDFs into searchable, embedded chunks.

Research Agent

Runs a six-step pipeline:

Plans sub-questions from the user's query
Scopes the corpus using ES|QL analytics
Identifies key papers by citation count
Retrieves evidence using hybrid keyword + semantic search across full-text chunks
Cross-checks findings for contradictions by running targeted searches for opposing evidence
Synthesizes everything into a structured literature review with inline citations and confidence tags ([SUPPORTED], [CONTESTED], [INSUFFICIENT])
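
Because the confidence tags are plain bracketed markers, a synthesized review can be tallied mechanically before it reaches the reviewer. The snippet below is a minimal sketch of that idea. The function name `count_confidence_tags` is hypothetical and is not part of the project's code.

```python
import re
from collections import Counter

# The three confidence tags named in the pipeline description.
TAG_PATTERN = re.compile(r"\[(SUPPORTED|CONTESTED|INSUFFICIENT)\]")


def count_confidence_tags(review_text: str) -> Counter:
    """Count how often each confidence tag appears in a review draft."""
    return Counter(TAG_PATTERN.findall(review_text))


if __name__ == "__main__":
    sample = "Claim A holds [SUPPORTED]. Claim B is disputed [CONTESTED]."
    print(count_confidence_tags(sample))
```

A tally like this could flag drafts with no tags at all, although the write-up does not say whether the system performs such a check.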

Review Agent

Evaluates the research draft through seven verification steps:

Checks structural completeness
Batch-verifies all references exist via ES|QL
Audits confidence tags by verifying each claim
Spot-checks quantitative claims against source text
Identifies missing high-impact papers by comparing cited references against the most cited papers for the topic
Validates contradictions by independently searching the corpus
Issues final verdict: 'PASS' or 'REVISION_NEEDED'

The Research Agent and Review Agent operate in an orchestrated loop. If the reviewer finds the draft unsatisfactory, it sends back specific, actionable feedback, and the Research Agent revises accordingly, for up to two iterations.
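
The loop can be sketched as plain control flow. This is an illustrative outline, not the project's orchestrator: the `research` and `review` callables stand in for the real agent calls, the `Review` type is hypothetical, and "up to two iterations" is read here as at most two revision rounds after the first draft.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

MAX_REVISIONS = 2  # assumption: "up to two iterations" means two revision rounds


@dataclass
class Review:
    verdict: str   # "PASS" or "REVISION_NEEDED"
    feedback: str  # actionable feedback for the next revision


def run_research_loop(
    query: str,
    research: Callable[[str, str], str],  # (query, feedback) -> draft
    review: Callable[[str], Review],      # draft -> Review
    max_revisions: int = MAX_REVISIONS,
) -> Tuple[str, Review, int]:
    """Draft, review, and revise until the reviewer passes or the cap is hit."""
    draft = research(query, "")
    result = review(draft)
    revisions = 0
    while result.verdict == "REVISION_NEEDED" and revisions < max_revisions:
        # Feed the reviewer's feedback back into the next research pass.
        draft = research(query, result.feedback)
        result = review(draft)
        revisions += 1
    return draft, result, revisions
```

Capping the number of revisions keeps the loop bounded even when the reviewer never issues a PASS. In that case the last draft is returned together with its REVISION_NEEDED verdict.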

Claim Verification Agent

Evaluates a specific claim against the corpus through a five-step pipeline:

Parses the claim into testable statements
Finds relevant papers using search queries with varied terminology
Gathers evidence and classifies each excerpt as SUPPORTS, CONTRADICTS, or QUALIFIES
Assesses nuances by searching for methodological differences and scope limitations
Produces a structured verdict with a confidence level
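
One way to picture the evidence-gathering step is as a list of labeled excerpts that is tallied before a verdict is written. The sketch below shows only that tally. The `Excerpt` type and `tally_evidence` function are hypothetical, and the rule the agent actually uses to turn evidence into a verdict and confidence level is not described in this write-up, so none is assumed here.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable

# The three evidence labels named in the pipeline description.
LABELS = ("SUPPORTS", "CONTRADICTS", "QUALIFIES")


@dataclass(frozen=True)
class Excerpt:
    paper_id: str  # hypothetical field names, for illustration only
    text: str
    label: str


def tally_evidence(excerpts: Iterable[Excerpt]) -> Dict[str, int]:
    """Count excerpts per label, rejecting labels outside the known set."""
    excerpts = list(excerpts)
    for excerpt in excerpts:
        if excerpt.label not in LABELS:
            raise ValueError(f"unknown label: {excerpt.label}")
    counts = Counter(excerpt.label for excerpt in excerpts)
    # Always report all three labels, including those with zero excerpts.
    return {label: counts.get(label, 0) for label in LABELS}
```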

Each agent uses 5 custom tools (2 index searches, 3 ES|QL tools) plus default platform tools. A FastAPI backend orchestrates the agent loop, streaming real-time reasoning traces via SSE. The system is accessible through three interfaces: a React web app, Slack (/research, /check-claim), and Claude Code via MCP.
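
For readers unfamiliar with SSE, each message on the wire is a small text frame terminated by a blank line. The helper below is a minimal sketch of that framing. The function name `format_sse`, the event name, and the payload fields are assumptions for illustration, not the project's actual schema.

```python
import json


def format_sse(event: str, payload: dict) -> str:
    """Serialize one Server-Sent Events frame with a named event and JSON data."""
    # json.dumps emits a single line by default, so one "data:" field suffices.
    data = json.dumps(payload)
    return f"event: {event}\ndata: {data}\n\n"


if __name__ == "__main__":
    # Hypothetical reasoning-trace step, printed as it would appear on the wire.
    print(format_sse("reasoning", {"agent": "research", "step": 1}), end="")
```

In a FastAPI backend, frames like these would typically be yielded from a generator wrapped in a `StreamingResponse` with `media_type="text/event-stream"`. Whether this project does exactly that is not stated.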

Features used: Elastic Agent Builder, custom index search tools, custom ES|QL tools, platform tools (execute_esql, search, get_document_by_id), converse streaming API, MCP.

What I liked: ES|QL tools were powerful for structured analytics like publication trends and batch reference verification, complementing the unstructured search tools well. The converse streaming API made it straightforward to build a real-time reasoning trace UI that builds trust in the output.

Challenge: Getting the Review Agent to verify citations efficiently. Initially, the reviewer made an individual Elasticsearch query for each reference, which meant 15+ tool calls just for citation checking. Restructuring this into a single batch ES|QL query, with the orchestrator pre-extracting paper_ids from the draft, reduced verification to one tool call and cut review time significantly.
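
The batching idea can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the `[paper:<id>]` citation format, the `papers` index name, and the helper names are all hypothetical. Only the `paper_id` field name comes from the description above.

```python
import re
from typing import Iterable, List

# Assumed inline citation format, e.g. "[paper:2401.01234]" (hypothetical).
CITATION_PATTERN = re.compile(r"\[paper:([A-Za-z0-9_.\-]+)\]")


def extract_paper_ids(draft: str) -> List[str]:
    """Return cited paper IDs in first-seen order, without duplicates."""
    return list(dict.fromkeys(CITATION_PATTERN.findall(draft)))


def build_batch_query(paper_ids: List[str], index: str = "papers") -> str:
    """Build one ES|QL query that checks every cited ID in a single call."""
    if not paper_ids:
        raise ValueError("no paper IDs to verify")
    # IDs are restricted to a safe character set by CITATION_PATTERN,
    # so they can be quoted directly as string literals.
    quoted = ", ".join(f'"{pid}"' for pid in paper_ids)
    return (
        f"FROM {index} | WHERE paper_id IN ({quoted}) "
        "| STATS chunk_count = COUNT(*) BY paper_id"
    )


def missing_ids(cited: List[str], found: Iterable[str]) -> List[str]:
    """List cited IDs that the batch query did not return."""
    found_set = set(found)
    return [pid for pid in cited if pid not in found_set]
```

Grouping by `paper_id` returns one row per paper even when the index stores several chunks per paper, so any cited ID that is absent from the result can be reported as a nonexistent reference.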

Built With

elastic
elastic-agent-builder
elasticsearch
fastapi
google-cloud
netlify
python

Updates

Aaryan Kandiah started this project — Feb 27, 2026 08:07 AM EST
