CodeFleet

AI Coding Agent Fleet Commander — orchestrate parallel Claude Code agents with Elasticsearch as the orchestration platform, not just the database.

🔗 GitHub Repository | 📹 Demo Video | License: Apache 2.0


The Problem: AI Coding Agents Don't Scale Without Orchestration

Running one AI coding agent is powerful. Running five in parallel is chaos.

Developers increasingly use AI agents like Claude Code, Cursor, and Codex to write production code. But the moment you try to run more than one agent on the same codebase, everything breaks down:

  • No task queue. You manually tell each agent what to do in a separate terminal.
  • No conflict detection. Two agents can silently edit the same file and create merge conflicts you don't discover until later.
  • No monitoring. You have no idea what each agent is working on, how long it's been running, or when it finishes.
  • No auto-assignment. When an agent completes its task, it sits idle until you notice and give it more work.
  • No dependency awareness. You mentally track what's blocked by what, and manually trigger the next phase.

Without an orchestration layer, the human becomes the message bus — relaying completion signals between agents, merging branches, copying config files, assigning next tasks. It works, but only because a person is doing the coordination full-time.


Our Solution: CodeFleet

CodeFleet automates everything that human was doing — and the orchestration logic itself runs natively in Elasticsearch via Elastic Workflows.

(Figure: three_panels)

CodeFleet works in three simple steps:

  • Step 1 (Chat with Fleet Commander): You describe what you want to build in natural language — through Kibana's Agent Builder chat UI, Claude Desktop via MCP, or the CLI. The Fleet Commander agent (built with Elastic Agent Builder) breaks your request into tasks with priorities, dependencies, and file scope, and writes them to Elasticsearch via a Workflow tool. Think of it as your AI PM.

  • Step 2 (Elastic Workflows Orchestrate): This is where Elasticsearch becomes more than a database. Three Elastic Workflows handle the orchestration logic — entirely server-side, no local code needed:

    • Auto-Assignment Workflow runs on a schedule, matches pending unblocked tasks to idle agents, and assigns them.
    • Task Completion Workflow fires when a runner marks a task complete, automatically unblocks dependent tasks by removing the completed task from their blocked_by lists, and moves newly-unblocked tasks to pending.
    • Stale Agent Workflow runs on a schedule, detects runners that haven't sent heartbeats, and re-queues their tasks.

Local runners simply poll Elasticsearch for their assignments, execute code via Claude Agent SDK, and report results back to ES — which triggers the next workflow cycle.

  • Step 3 (Monitor Everything in Real-Time): Every agent event — task started, file changed, task completed, conflict detected — is indexed in Elasticsearch. Kibana Discover gives you a real-time activity stream. Fleet Commander can answer questions like "any conflicts?" or "what's the status?" by querying ES directly through its ES|QL tools.
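The runner side of Step 2 can be sketched in a few lines of Python. The snippet below uses an in-memory list as a stand-in for the Elasticsearch tasks index; the field names (`status`, `assigned_to`, `result`) and the `execute` stub are illustrative assumptions, not the project's actual schema:

```python
# In-memory stand-in for the Elasticsearch tasks index (field names are
# illustrative, not the real mapping).
tasks = [
    {"id": "t1", "status": "assigned", "assigned_to": "runner-1", "result": None},
    {"id": "t2", "status": "pending", "assigned_to": None, "result": None},
]

def poll_assignments(store, runner_id):
    """Return tasks assigned to this runner (stand-in for an ES query)."""
    return [t for t in store
            if t["status"] == "assigned" and t["assigned_to"] == runner_id]

def execute(task):
    """Placeholder for a Claude Agent SDK session doing the real coding work."""
    return f"done:{task['id']}"

def run_once(store, runner_id):
    """One poll cycle: claim assignments, execute, report completion."""
    for task in poll_assignments(store, runner_id):
        task["status"] = "in_progress"
        task["result"] = execute(task)
        # In the real system, writing this status update back to ES is what
        # fires the Task Completion Workflow server-side.
        task["status"] = "completed"

run_once(tasks, "runner-1")
print(tasks[0]["status"])  # -> completed
```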

Without CodeFleet: What an Engineer Does Today

To understand why Elasticsearch is the right backbone for this, consider what you'd have to build or do manually without it:

| What You Do Today (Manual) | What CodeFleet + Elastic Provides |
| --- | --- |
| Open 3-5 terminal tabs, copy-paste kickoff prompts to each agent | Runners poll ES for assignments automatically; Auto-Assignment Workflow matches them to tasks |
| Mentally track which tasks are blocked by which | `depends_on` fields in ES + Elastic Workflow auto-unblocks when tasks complete |
| Relay "agent 1 is done" messages between agents | Task Completion Workflow fires automatically, unblocks dependents |
| Manually assign the next batch of work | Auto-Assignment Workflow matches idle agents to pending tasks every 5 seconds |
| Notice and fix interface conflicts after the fact | ES\|QL aggregation detects overlapping file edits in real-time |
| Say "start Phase 2" when you think everything's ready | Workflows cascade: complete → unblock → assign → execute, all automatic |
| Build a React dashboard for monitoring | Kibana dashboards — real-time, zero frontend code |
| Build a chat UI + tool-calling framework | Agent Builder — chat, tools, MCP, all included |

Why Elasticsearch and Agent Builder? (Not Just a Database)

The key insight behind CodeFleet is that Elasticsearch isn't just storage — it's the orchestration platform.

Most AI agent projects use a database to store state and write orchestration logic in Python/Node. CodeFleet pushes that orchestration into Elastic itself:

| What Elastic Provides | What We'd Build Without It |
| --- | --- |
| Agent Builder — Fleet Commander with AI reasoning | Custom chat frontend + tool-calling framework |
| ES\|QL tools — 7 expert-defined queries as agent tools | Build parameter routing + query engine from scratch |
| Elastic Workflows — auto-assignment, dependency unblocking, health checks | Write a custom job scheduler + state machine in Python |
| Semantic search — built-in embeddings for duplicate detection | External embedding API + vector DB setup |
| Kibana Discover — real-time activity stream, zero frontend | Build a log viewer + monitoring dashboard |
| MCP server — IDE integration out of the box | Build an MCP server from scratch |

The orchestration brain lives in Elastic. The local machine only needs thin runners that poll for assignments and execute code. If you replaced Elasticsearch with Postgres, you'd need to rebuild the workflow engine, the query tools, the semantic search, the monitoring, and the chat interface — from scratch.


How It Works (Technical Detail)

(Figure: architecture)

| Component | Technology | Role |
| --- | --- | --- |
| User Interfaces | Kibana Chat, Claude Desktop (MCP), CLI | How you interact with the fleet |
| Fleet Commander | Elastic Agent Builder + 8 tools | AI PM: searches backlog, creates tasks, detects conflicts, reviews work |
| Elastic Workflows | 4 YAML-defined workflows | Orchestration engine: auto-assign, unblock deps, create tasks, health checks |
| Elasticsearch | ES Serverless (5 indices) | Shared brain: tasks, agents, activity, file changes, conflicts |
| Semantic Search | `.multilingual-e5-small-elasticsearch` | Built-in embeddings for duplicate task detection — no external API |
| Kibana | Discover + Dashboards | Real-time fleet status, activity stream, event exploration |
| Runners | Claude Agent SDK (Python) | Engineers: poll ES for assignments, execute coding tasks, report back |
| Your Codebase | Local filesystem | Each runner writes code, reports file changes to ES |

Data flow:

  1. You tell Fleet Commander what to build (via Kibana, MCP, or CLI)
  2. Fleet Commander breaks it into tasks with dependencies and creates them in ES via the create_task Workflow
  3. Auto-Assignment Workflow fires every 5 seconds — finds pending unblocked tasks, matches them to idle runners, assigns them
  4. Runners poll ES, find their assignments, launch Claude Agent SDK sessions, write code
  5. During execution, runners log every event and file change to ES in real-time
  6. On completion, runner updates task status → Task Completion Workflow fires automatically, unblocks dependents, moves them to pending
  7. Cycle repeats: Auto-Assignment Workflow picks up the newly-pending tasks and assigns them to idle runners
  8. Stale Agent Workflow runs every 60 seconds, re-queues tasks from crashed runners
  9. Fleet Commander can answer status questions, reassign work, or review completed tasks — anytime via ES|QL tools
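Steps 3 and 7 boil down to a simple matching rule: pending, unblocked tasks go to idle runners, highest priority first. Here is a pure-Python sketch of that rule; the actual workflow runs server-side in Elastic, and the field names used here are illustrative:

```python
def auto_assign(tasks, agents):
    """Match pending, unblocked tasks to idle agents, highest priority first.
    Pure-Python sketch of the auto-assignment rule with assumed field names."""
    pending = sorted(
        (t for t in tasks if t["status"] == "pending" and not t["blocked_by"]),
        key=lambda t: t["priority"],
        reverse=True,
    )
    idle = [a for a in agents if a["status"] == "idle"]
    assignments = []
    for task, agent in zip(pending, idle):
        task["status"] = "assigned"
        task["assigned_to"] = agent["id"]
        agent["status"] = "working"
        assignments.append((task["id"], agent["id"]))
    return assignments

tasks = [
    {"id": "t1", "status": "pending", "blocked_by": [], "priority": 2, "assigned_to": None},
    {"id": "t2", "status": "pending", "blocked_by": ["t1"], "priority": 9, "assigned_to": None},
]
agents = [{"id": "runner-1", "status": "idle"}]
print(auto_assign(tasks, agents))  # -> [('t1', 'runner-1')]
```

Note that `t2` stays put despite its higher priority: it is still blocked by `t1`, so it never enters the candidate pool.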

Elastic Workflows: Where the Orchestration Lives

The orchestration isn't in Python — it's in Elastic. Here are the 4 workflows that power CodeFleet:

| Workflow | Trigger | What It Does |
| --- | --- | --- |
| `create_task` | Manual (called by Fleet Commander) | Indexes a new task document with all fields, including semantic embeddings for duplicate detection |
| `auto_assign_tasks` | Scheduled (every 5s) | Queries for pending unblocked tasks and idle agents, assigns tasks to agents, updates both documents |
| `handle_task_completion` | Alert (task status → completed) | Finds all tasks that depend on the completed one, removes it from their `blocked_by` lists, sets fully-unblocked tasks to pending |
| `handle_stale_agents` | Scheduled (every 60s) | Finds agents with stale heartbeats (>2 min), marks them offline, re-queues their assigned tasks back to pending |
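The unblocking step in `handle_task_completion` is the heart of the dependency cascade. Below is a plain-Python approximation of that logic; the `blocked_by` field and the `blocked` status name are illustrative assumptions, and the real version is a declarative YAML workflow, not Python:

```python
def handle_task_completion(completed_id, tasks):
    """Remove the completed task from every blocked_by list and move tasks
    whose list becomes empty from 'blocked' to 'pending'. Sketch only."""
    unblocked = []
    for task in tasks:
        blocked_by = task.get("blocked_by", [])
        if completed_id in blocked_by:
            blocked_by.remove(completed_id)
            if not blocked_by and task["status"] == "blocked":
                task["status"] = "pending"
                unblocked.append(task["id"])
    return unblocked

tasks = [
    {"id": "t2", "status": "blocked", "blocked_by": ["t1"]},
    {"id": "t3", "status": "blocked", "blocked_by": ["t1", "t4"]},
]
print(handle_task_completion("t1", tasks))  # -> ['t2']
```

`t3` loses one blocker but stays blocked on `t4`, which is exactly the cascade behavior described in the table above.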

ES|QL Tools: Where the Intelligence Lives

The Fleet Commander isn't just a chatbot — it's an agent whose reasoning is grounded in real data. Here are the 8 tools (7 ES|QL queries and 1 Workflow tool) that power it:

| Tool | What It Does | ES\|QL Pattern |
| --- | --- | --- |
| `search_backlog` | Find highest-priority unblocked tasks | Filter by status, sort by priority, exclude blocked |
| `check_agent_status` | See what all agents are doing right now | Aggregate agents by status (idle/working/offline) |
| `detect_conflicts` | Find files modified by multiple agents | Aggregate file changes, filter count > 1 within time window |
| `assign_task` | Look up task details for assignment verification | Query task by ID, verify status before assignment |
| `review_completed` | Review recently completed work | Filter by status=completed, include cost and duration metrics |
| `find_similar_tasks` | Semantic search for duplicate/related tasks | MATCH query on `semantic_text` embeddings (built-in `.multilingual-e5-small-elasticsearch`) |
| `create_task` | Create a new task in the backlog | Workflow tool — indexes task with semantic field for vector search |
| `analyze_dependencies` | Check what's blocked by what | Filter on `depends_on` fields, identify dependency chains |
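The `detect_conflicts` pattern (aggregate file changes, flag files with more than one agent inside a time window) can be approximated in plain Python. The real tool is a server-side ES|QL aggregation; the field names here (`path`, `agent`, `ts`) are illustrative:

```python
from collections import defaultdict
from datetime import datetime, timedelta

def detect_conflicts(file_changes, window=timedelta(minutes=30)):
    """Flag files touched by more than one agent within the time window.
    Plain-Python stand-in for the ES|QL aggregation; field names assumed."""
    cutoff = max(c["ts"] for c in file_changes) - window
    agents_per_file = defaultdict(set)
    for change in file_changes:
        if change["ts"] >= cutoff:
            agents_per_file[change["path"]].add(change["agent"])
    return {path: sorted(agents)
            for path, agents in agents_per_file.items() if len(agents) > 1}

now = datetime(2024, 1, 1, 12, 0)
changes = [
    {"path": "app/api.py", "agent": "runner-1", "ts": now},
    {"path": "app/api.py", "agent": "runner-2", "ts": now - timedelta(minutes=5)},
    {"path": "README.md", "agent": "runner-1", "ts": now},
]
print(detect_conflicts(changes))  # -> {'app/api.py': ['runner-1', 'runner-2']}
```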

This is the pattern we love about Agent Builder: business logic lives in expert-defined ES|QL queries, not in LLM hallucinations. The agent reasons about what to do, but the data it reasons over is precise, real-time, and grounded in Elasticsearch.


Agent Builder Features Used

| Feature | How We Use It |
| --- | --- |
| Custom Agents | Fleet Commander with specialized fleet orchestration instructions |
| ES\|QL Tools | 7 parameterized tools for backlog search, agent status, conflict detection, semantic search, dependency analysis, task lookup, work review |
| Elastic Workflows | 4 workflows: task creation, auto-assignment, dependency unblocking, stale agent recovery |
| Semantic Search | `semantic_text` field with built-in `.multilingual-e5-small-elasticsearch` inference — no external embedding API |
| MCP Server | Fleet Commander accessible from Claude Desktop — natural language fleet management from your IDE |
| Kibana Discover | Real-time activity stream: every agent event indexed and explorable |

Challenges and Things We Liked

Things we liked:

  1. The ES|QL parameterized tool pattern is production-grade. In most AI agent frameworks, the agent writes its own queries — which means hallucinated SQL, wrong table names, off-by-one filters. With Agent Builder, the queries are written by a domain expert and the agent just fills in parameters. This is how production AI should work, and it's the single biggest differentiator we've seen in any agent framework.

  2. Elastic Workflows let us move orchestration to the cloud. The auto-assignment and dependency unblocking logic that would normally be a complex Python state machine is instead 3 declarative YAML files running server-side in Elastic. When a task completes, the unblocking workflow fires automatically — no local daemon needed for that logic. This means the "brain" runs in Elastic and only the "hands" (Claude Agent SDK runners) need to run locally.

  3. Built-in semantic search was a game-changer. The semantic_text field type with .multilingual-e5-small-elasticsearch gave us vector search for duplicate task detection with zero external dependencies. No OpenAI embeddings API, no Pinecone, no vector DB setup. Just a field type in the mapping and it works.

Challenges:

  1. Workflow string handling required coercion logic. Elastic Workflows pass all inputs as strings, but our Pydantic models expect typed lists for fields like `depends_on` and `file_scope`. We added field validators to coerce CSV strings into lists, and patched our ES search layer to normalize these fields on read. Small friction, but it shows Workflows are still maturing.
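The coercion boils down to one normalization rule. It is shown here as a plain function rather than the actual Pydantic validator, with the comma-separated input format assumed:

```python
def coerce_list(value):
    """Normalize a Workflow input into a typed list. Workflows hand us
    strings, so "t1, t2" must become ["t1", "t2"]; already-parsed lists
    and empty values pass through. Sketch of our validator's logic."""
    if value is None or value == "":
        return []
    if isinstance(value, list):
        return value
    return [item.strip() for item in str(value).split(",") if item.strip()]

print(coerce_list("t1, t2,t3"))  # -> ['t1', 't2', 't3']
```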

  2. ES|QL string parameters can't be used in timespan expressions. Our `review_completed` tool needed restructuring because you can't parameterize `NOW() - ?since`. We also discovered that Agent Builder's param type validation requires `integer` instead of `number` — small things that cost real debugging time.
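One workaround for the timespan limitation: compute the cutoff client-side and pass a single concrete timestamp parameter, so the query never does timespan arithmetic on a parameter. A sketch, with an illustrative index and field name in the commented query:

```python
from datetime import datetime, timedelta, timezone

def completed_since_cutoff(hours: int) -> str:
    """Compute a concrete UTC cutoff timestamp client-side, so the ES|QL
    query only compares against one timestamp parameter instead of
    evaluating NOW() - ?since. Sketch; index/field names are assumptions."""
    return (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()

# The tool's query can then compare against the precomputed value, e.g.:
#   FROM codefleet-tasks
#   | WHERE status == "completed" AND completed_at >= ?cutoff
cutoff = completed_since_cutoff(24)
print(cutoff.endswith("+00:00"))  # -> True
```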


What's Next

| Feature | Status |
| --- | --- |
| Task queue with dependencies, priorities, file scope | ✅ Shipped |
| Fleet Commander with 8 tools (7 ES\|QL + 1 workflow) | ✅ Shipped |
| Claude Agent SDK runners with polling + execution | ✅ Shipped |
| Elastic Workflows: auto-assign, unblock deps, stale recovery | ✅ Shipped |
| Conflict detection tool via ES aggregations | ✅ Shipped |
| Semantic search via built-in Elastic Inference embeddings | ✅ Shipped |
| MCP integration (Claude Desktop) | ✅ Shipped |
| Kibana Discover for real-time event exploration | ✅ Shipped |
| Dynamic runner scaling (0→N based on queue) | ✅ Shipped |
| Automatic dependency unblocking via Workflow | ✅ Shipped |
| Fleet Commander task creation via Workflow tool | ✅ Shipped |
| A2A protocol for cross-framework agent communication | 🔜 Next |
| Git automation (worktree creation, branch merge, PR) | 🔜 Next |
| Custom Kibana dashboards with panels | 🔜 Next |
| Multi-model routing (Opus for complex, Sonnet for simple) | 📋 Planned |
| GitHub/Linear integration for importing backlogs | 📋 Planned |

CodeFleet is open source (Apache 2.0) and designed to run headlessly on always-on machines. It's not a demo — it's a tool built for real AI coding workflows, with Elasticsearch as the orchestration platform.

🔗 GitHub Repository | 📹 Demo Video
