Inspiration
Legal eDiscovery is a $15B industry where attorneys manually review thousands of documents to find the needle in the haystack — the one email that contradicts a deposition, the access log that proves someone lied under oath. A single case can take months and cost millions in billable hours.
We asked: what if a swarm of specialized AI agents could do this autonomously?
Not one general-purpose chatbot, but seven purpose-built investigators, each with its own tools, expertise, and role, coordinated through a structured investigation protocol. This is the kind of multi-agent orchestration that Elastic Agent Builder was designed for.
What It Does
ARGUS orchestrates 7 AI agents through a 6-phase investigation pipeline that mirrors how real legal investigation teams work:
1. Intake & Classification
A Document Classifier agent scans all evidence, tags relevance, and identifies key entities.
2. Relationship Mapping
A Relationship Mapper agent analyzes communication patterns between all parties, flagging suspicious external contacts.
3. Timeline Construction
A Timeline Builder agent constructs a chronological narrative, identifying escalation points and correlated events.
4. Pattern Detection & Sentiment Analysis (Parallel)
Two agents run simultaneously:
- Pattern Detector — Finds anomalous file access patterns (after-hours downloads, bulk transfers, deletion spikes).
- Sentiment Analyzer — Tracks behavioral shifts in communication tone.
5. Contradiction Hunting
A Contradiction Hunter agent cross-references sworn deposition testimony against documentary evidence, identifying potential perjury.
6. Synthesis & Final Report
A Lead Investigator agent reviews all findings and produces a structured legal report with case strength assessment and recommended actions.
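The phase ordering above can be sketched with asyncio. This is a minimal illustration under stated assumptions, not the ARGUS implementation: `run_agent` is a hypothetical stand-in for a real agent call, the agent labels are shortened names taken from this write-up, and passing one-line summaries forward is only one way to hand context to later phases.

```python
import asyncio

# Hypothetical phase plan: each inner list is one phase, and agents that
# share a phase run concurrently. Labels mirror the write-up, not real IDs.
PHASES = [
    ["document_classifier"],
    ["relationship_mapper"],
    ["timeline_builder"],
    ["pattern_detector", "sentiment_analyzer"],  # Phase 4 runs in parallel
    ["contradiction_hunter"],
    ["lead_investigator"],
]


async def run_agent(name: str, context: list) -> str:
    """Stand-in for a real agent call; returns a one-line summary."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return f"{name} saw {len(context)} prior summaries"


async def run_pipeline() -> list:
    context: list = []
    for phase in PHASES:
        # Every agent in a phase sees the same snapshot of earlier findings.
        snapshot = list(context)
        results = await asyncio.gather(*(run_agent(a, snapshot) for a in phase))
        # The next phase starts only after this phase has fully finished.
        context.extend(results)
    return context


summaries = asyncio.run(run_pipeline())
```

Because `asyncio.gather` returns results in argument order, the two Phase 4 summaries land in a stable position even though the agents run concurrently.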
Data Interaction
Each agent uses ES|QL queries and Index Search tools to interrogate real Elasticsearch data:
- Emails
- File access logs
- Chat messages
- Depositions
- Contracts
- Personnel records
Results stream in real-time to a dashboard featuring:
- Relationship network graph
- Evidence timeline
- Contradiction panel
- Live agent activity feed
Demo Case Results
Case: NovaTech v. Marcus Chen — suspected IP theft
ARGUS Output (≈10 minutes):
- 69 findings
- 12 mapped relationships
- 16 key timeline events
- 4 material contradictions between testimony and documents
How We Built It
Backend
- FastAPI server orchestrating agents via the Agent Builder Converse API.
- Agents deployed with specialized system instructions and curated toolsets.
- Structured JSON parsing and typed WebSocket event emission:
  relationship_found, timeline_event, contradiction_found, anomaly_detected
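One way to model the four event types named in this write-up (relationship_found, timeline_event, contradiction_found, anomaly_detected) is a small validated envelope that serializes to JSON before it is sent over a WebSocket. The envelope shape here (`type`, `agent`, `payload`) and the `AgentEvent` class are assumptions for illustration; the actual ARGUS message format is not documented here.

```python
import json
from dataclasses import asdict, dataclass, field

# The four event names come from the write-up; everything else is assumed.
EVENT_TYPES = frozenset(
    {"relationship_found", "timeline_event", "contradiction_found", "anomaly_detected"}
)


@dataclass
class AgentEvent:
    type: str
    agent: str
    payload: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject unknown types early so the frontend never sees them.
        if self.type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {self.type!r}")

    def to_json(self) -> str:
        """Serialize the event as a JSON text frame."""
        return json.dumps(asdict(self))


event = AgentEvent(
    type="relationship_found",
    agent="relationship_mapper",
    payload={"source": "a", "target": "b", "link_type": "email"},  # placeholder values
)
message = event.to_json()
```

Validating the type at construction time keeps the frontend's event handling simple: every frame it receives carries one of a known, closed set of types.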
Agents & Tools
- 7 agents
- 30 tools total
- 24 ES|QL tools for structured queries
- 6 Index Search tools for semantic evidence discovery
Frontend
- Next.js with real-time WebSocket streaming
- Force-directed network graph visualization
- Live-updating timeline, evidence board, and contradiction panels
- Automatic final report generation with case strength scoring
Data Layer
Six Elasticsearch indices containing:
- Emails
- Documents
- File access logs
- Chat messages
- Depositions
- People directory
Synthetic but realistic dataset modeling a corporate IP theft scenario:
- 750+ emails
- 2000+ file access records
- Depositions
- Internal documents
Challenges We Ran Into
Multi-Agent Orchestration Is a State Management Nightmare
Coordinating 7 agents across 6 phases — with Phase 4 running two agents in parallel — meant solving real concurrency problems. Each agent maintains its own conversation state via the Converse API, emits different event types (relationship_found, timeline_event, contradiction_found, anomaly_detected), and writes findings back to Elasticsearch while simultaneously streaming to the frontend via WebSocket. Race conditions between parallel agents writing to the same findings index, conversation ID management across API calls, and ensuring the frontend state machine correctly handles interleaved events from concurrent agents required careful architectural decisions. We couldn't just fire-and-forget — later phases depend on earlier agents having indexed their findings.
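A common way to avoid races when parallel agents write to one shared destination is to route every event through a single consumer. The sketch below illustrates that pattern with an `asyncio.Queue`. It is not necessarily how ARGUS solves the problem: the `agent` and `writer` functions, the event shape, and the in-memory `sink` (standing in for the findings index and the WebSocket) are all hypothetical.

```python
import asyncio

STOP = None  # sentinel telling the writer to finish


async def agent(name: str, count: int, queue: asyncio.Queue) -> None:
    """Hypothetical agent: emits `count` events, yielding between them."""
    for i in range(count):
        await queue.put({"agent": name, "seq": i})
        await asyncio.sleep(0)  # let the other agent interleave


async def writer(queue: asyncio.Queue, sink: list) -> None:
    """Single consumer: the only code that touches the shared sink."""
    while True:
        event = await queue.get()
        if event is STOP:
            return
        # Real code might index the event and forward it to the frontend here.
        sink.append(event)


async def run_parallel_phase() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    sink: list = []
    writer_task = asyncio.create_task(writer(queue, sink))
    await asyncio.gather(
        agent("pattern_detector", 3, queue),
        agent("sentiment_analyzer", 3, queue),
    )
    await queue.put(STOP)  # both agents are done; stop after draining
    await writer_task
    return sink


events = asyncio.run(run_parallel_phase())
```

Events from the two agents may interleave in the sink, but each agent's own events stay in order, and the phase does not return until every event has been written. That matches the requirement that later phases see all earlier findings.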
Structured Output From Unstructured Reasoning
The fundamental tension: agents need freedom to reason deeply over evidence, but the dashboard needs typed, structured data. We couldn't use rigid output schemas without crippling the agents' investigative reasoning. Our solution was a dual-output protocol — agents produce rich analytical narratives and then emit a structured JSON block at the end. But getting this reliable across 7 different agents with different output schemas (relationships need source/target/link_type, contradictions need claim/evidence_against/deposition_ref, anomalies need metric_value/metric_unit) required iterating on prompt engineering, building a multi-strategy JSON extraction pipeline, and implementing graceful fallbacks that still populate the dashboard when an agent deviates from format.
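A multi-strategy extraction step of the kind described above might look like the following sketch. The specific strategies, their order, and the fallback shape (`findings`, `raw_text`) are assumptions for illustration; the actual ARGUS pipeline may differ.

```python
import json
import re
from typing import Optional

FENCE = "`" * 3  # built at runtime so this example nests inside a fenced block
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_json(text: str, fallback: Optional[dict] = None) -> dict:
    """Pull a structured JSON object out of a free-form agent response."""
    # Strategy 1: the last fenced block that parses as a JSON object.
    for candidate in reversed(FENCED_BLOCK.findall(text)):
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed

    # Strategy 2: scan for bare JSON objects in the prose, keeping the last.
    decoder = json.JSONDecoder()
    found = None
    idx = text.find("{")
    while idx != -1:
        try:
            obj, end = decoder.raw_decode(text, idx)
        except json.JSONDecodeError:
            idx = text.find("{", idx + 1)
            continue
        if isinstance(obj, dict):
            found = obj
        idx = text.find("{", end)
    if found is not None:
        return found

    # Strategy 3: graceful fallback so the dashboard still gets something.
    if fallback is not None:
        return dict(fallback)
    return {"findings": [], "raw_text": text}
```

Preferring the last parseable object fits the dual-output protocol, where the structured block comes at the end of the narrative. The fallback keeps the raw text so a malformed response can still be shown or reprocessed.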
Entity Resolution Across Independent Agents
Each agent independently discovers and references people — the Relationship Mapper finds "Marcus Chen" in email headers, the Pattern Detector sees "marcus-chen" in file access logs, and the Contradiction Hunter reads "Mr. Chen" in depositions. Without a shared entity resolution layer, the network graph fragments into disconnected nodes for the same person. We built a normalization pipeline that handles slug conversion, partial name matching, and last-name fallback to maintain a consistent entity graph across all agents — critical because the entire value proposition of multi-agent investigation is connecting findings across agents.
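The three normalization steps named above (slug conversion, partial name matching, last-name fallback) can be sketched as follows. The `EntityResolver` class, the honorific list, and the rule of returning nothing on ambiguity are assumptions made for this example, and "Jane Doe" is a placeholder rather than a person from the demo case.

```python
import re
from typing import Optional

HONORIFICS = {"mr", "mrs", "ms", "dr"}  # assumed list; extend as needed


def slugify(name: str) -> str:
    """Lowercase, drop punctuation and honorifics, join tokens with '-'."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return "-".join(t for t in tokens if t not in HONORIFICS)


class EntityResolver:
    def __init__(self, known_people: list) -> None:
        # Map each canonical name's slug back to its display form.
        self.canonical = {slugify(p): p for p in known_people}

    def resolve(self, mention: str) -> Optional[str]:
        slug = slugify(mention)
        if not slug:
            return None
        # Step 1: exact slug match ("marcus-chen" == "Marcus Chen").
        if slug in self.canonical:
            return self.canonical[slug]
        # Step 2: partial match, where all mention tokens appear in one name.
        tokens = set(slug.split("-"))
        partial = [n for s, n in self.canonical.items() if tokens <= set(s.split("-"))]
        if len(partial) == 1:
            return partial[0]
        # Step 3: last-name fallback, accepted only if unambiguous.
        last = slug.split("-")[-1]
        by_last = [n for s, n in self.canonical.items() if s.split("-")[-1] == last]
        return by_last[0] if len(by_last) == 1 else None


resolver = EntityResolver(["Marcus Chen", "Jane Doe"])
```

With this resolver, "Marcus Chen", "marcus-chen", and "Mr. Chen" all map to the same canonical node. Returning `None` for ambiguous or unknown mentions is a conservative choice: it avoids merging two different people into one graph node.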
Designing the Investigation Protocol Itself
The hardest challenge wasn't code — it was designing which agents exist, what tools each one gets, what order they run in, and what runs in parallel. Give an agent too many tools and it wastes turns exploring irrelevant data. Give it too few and it can't cross-reference evidence. We went through multiple iterations of the phase structure before landing on the current 6-phase pipeline, where each phase's output becomes the next phase's context. The decision to run Pattern Detection and Sentiment Analysis in parallel (Phase 4) while keeping Contradiction Hunting sequential (Phase 5, after all evidence is indexed) was a deliberate architectural choice — contradictions require all prior findings to be queryable.
Accomplishments We’re Proud Of
- Zero simulation, zero hardcoded data — All findings come from real agents querying real indices.
- 100% JSON parse rate — Every agent response produces structured dashboard data.
- Parallel agent execution — True concurrency in Phase 4.
- Genuinely useful investigations — Real coordinated patterns and contradictions discovered.
What We Learned
Agent Builder excels at multi-agent architectures. The combination of:
- ES|QL tools → precision
- Index Search tools → exploration
…gives agents both surgical accuracy and discovery capability.
Key Insight: Don’t fight the LLM’s natural format. Let agents produce rich markdown, then extract structured JSON blocks. You get human-readable reports and machine-parseable data.
What’s Next for ARGUS
General-Purpose Case Support
Dynamic prompt generation from case metadata to support any investigation type.
Real Document Ingestion
Upload PDFs, CSVs, PSTs, XLSX files — automatically parsed and indexed.
Agent Memory Across Phases
Later agents receive earlier summaries, building cumulative investigative context.
Confidence Calibration
Track corroborated findings across agents to build evidence chain scoring.
Built With
- asyncio
- d3.js
- elastic-agent-builder
- elasticsearch
- es|ql
- fastapi
- framer-motion
- httpx
- kibana
- next.js
- python
- react
- tailwind-css
- typescript
- websocket