CodeFleet

AI Coding Agent Fleet Commander — orchestrate parallel Claude Code agents with Elasticsearch as the orchestration platform, not just the database.

🔗 GitHub Repository | 📹 Demo Video | License: Apache 2.0


The Problem: AI Coding Agents Don't Scale Without Orchestration

Running one AI coding agent is powerful. Running five in parallel is chaos.

Developers increasingly use AI agents like Claude Code, Cursor, and Codex to write production code. But the moment you try to run more than one agent on the same codebase, everything breaks down:

  • No task queue. You manually tell each agent what to do in a separate terminal.
  • No conflict detection. Two agents can silently edit the same file and create merge conflicts you don't discover until later.
  • No monitoring. You have no idea what each agent is working on, how long it's been running, or when it finishes.
  • No auto-assignment. When an agent completes its task, it sits idle until you notice and give it more work.
  • No dependency awareness. You mentally track what's blocked by what, and manually trigger the next phase.

Without an orchestration layer, the human becomes the message bus — relaying completion signals between agents, merging branches, copying config files, assigning next tasks. It works, but only because a person is doing the coordination full-time.


Our Solution: CodeFleet

CodeFleet automates everything that human was doing — and the orchestration logic itself runs natively in Elasticsearch via Elastic Workflows.

(Figure: three_panels)

CodeFleet works in three simple steps:

  • Step 1 (Chat with Fleet Commander): You describe what you want to build in natural language — through Kibana's Agent Builder chat UI, Claude Desktop via MCP, or the CLI. The Fleet Commander agent (built with Elastic Agent Builder) breaks your request into tasks with priorities, dependencies, and file scope, and writes them to Elasticsearch via a Workflow tool. Think of it as your AI PM.

  • Step 2 (Elastic Workflows Orchestrate): This is where Elasticsearch becomes more than a database. Three Elastic Workflows handle the orchestration logic — entirely server-side, no local code needed:

    • Auto-Assignment Workflow runs on a schedule, matches pending unblocked tasks to idle agents, and assigns them.
    • Task Completion Workflow fires when a runner marks a task complete, automatically unblocks dependent tasks by removing the completed task from their blocked_by lists, and moves newly-unblocked tasks to pending.
    • Stale Agent Workflow runs on a schedule, detects runners that haven't sent heartbeats, and re-queues their tasks.

Local runners simply poll Elasticsearch for their assignments, execute code via Claude Agent SDK, and report results back to ES — which triggers the next workflow cycle.

  • Step 3 (Monitor Everything in Real-Time): Every agent event — task started, file changed, task completed, conflict detected — is indexed in Elasticsearch. Kibana Discover gives you a real-time activity stream. Fleet Commander can answer questions like "any conflicts?" or "what's the status?" by querying ES directly through its ES|QL tools.
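The runner side of Step 2 can be sketched in a few lines of Python. The snippet below uses an in-memory list as a stand-in for the Elasticsearch tasks index; the field names (`status`, `assigned_to`, `result`) and the `execute` stub are illustrative assumptions, not the project's actual schema:

```python
# In-memory stand-in for the Elasticsearch tasks index (field names are
# illustrative, not the real mapping).
tasks = [
    {"id": "t1", "status": "assigned", "assigned_to": "runner-1", "result": None},
    {"id": "t2", "status": "pending", "assigned_to": None, "result": None},
]

def poll_assignments(store, runner_id):
    """Return tasks assigned to this runner (stand-in for an ES query)."""
    return [t for t in store
            if t["status"] == "assigned" and t["assigned_to"] == runner_id]

def execute(task):
    """Placeholder for a Claude Agent SDK session doing the real coding work."""
    return f"done:{task['id']}"

def run_once(store, runner_id):
    """One poll cycle: claim assignments, execute, report completion."""
    for task in poll_assignments(store, runner_id):
        task["status"] = "in_progress"
        task["result"] = execute(task)
        # In the real system, writing this status update back to ES is what
        # fires the Task Completion Workflow server-side.
        task["status"] = "completed"

run_once(tasks, "runner-1")
print(tasks[0]["status"])  # -> completed
```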

Without CodeFleet: What an Engineer Does Today

To understand why Elasticsearch is the right backbone for this, consider what you'd have to build or do manually without it:

| What You Do Today (Manual) | What CodeFleet + Elastic Provides |
| --- | --- |
| Open 3-5 terminal tabs, copy-paste kickoff prompts to each agent | Runners poll ES for assignments automatically; Auto-Assignment Workflow matches them to tasks |
| Mentally track which tasks are blocked by which | `depends_on` fields in ES + Elastic Workflow auto-unblocks when tasks complete |
| Relay "agent 1 is done" messages between agents | Task Completion Workflow fires automatically, unblocks dependents |
| Manually assign the next batch of work | Auto-Assignment Workflow matches idle agents to pending tasks every 5 seconds |
| Notice and fix interface conflicts after the fact | ES\|QL aggregation detects overlapping file edits in real-time |
| Say "start Phase 2" when you think everything's ready | Workflows cascade: complete → unblock → assign → execute, all automatic |
| Build a React dashboard for monitoring | Kibana dashboards — real-time, zero frontend code |
| Build a chat UI + tool-calling framework | Agent Builder — chat, tools, MCP, all included |

Why Elasticsearch and Agent Builder? (Not Just a Database)

The key insight behind CodeFleet is that Elasticsearch isn't just storage — it's the orchestration platform.

Most AI agent projects use a database to store state and write orchestration logic in Python/Node. CodeFleet pushes that orchestration into Elastic itself:

| What Elastic Provides | What We'd Build Without It |
| --- | --- |
| Agent Builder — Fleet Commander with AI reasoning | Custom chat frontend + tool-calling framework |
| ES\|QL tools — 7 expert-defined queries as agent tools | Build parameter routing + query engine from scratch |
| Elastic Workflows — auto-assignment, dependency unblocking, health checks | Write a custom job scheduler + state machine in Python |
| Semantic search — built-in embeddings for duplicate detection | External embedding API + vector DB setup |
| Kibana Discover — real-time activity stream, zero frontend | Build a log viewer + monitoring dashboard |
| MCP server — IDE integration out of the box | Build an MCP server from scratch |

The orchestration brain lives in Elastic. The local machine only needs thin runners that poll for assignments and execute code. If you replaced Elasticsearch with Postgres, you'd need to rebuild the workflow engine, the query tools, the semantic search, the monitoring, and the chat interface — from scratch.


How It Works (Technical Detail)

(Figure: architecture)

| Component | Technology | Role |
| --- | --- | --- |
| User Interfaces | Kibana Chat, Claude Desktop (MCP), CLI | How you interact with the fleet |
| Fleet Commander | Elastic Agent Builder + 8 tools | AI PM: searches backlog, creates tasks, detects conflicts, reviews work |
| Elastic Workflows | 4 YAML-defined workflows | Orchestration engine: auto-assign, unblock deps, create tasks, health checks |
| Elasticsearch | ES Serverless (5 indices) | Shared brain: tasks, agents, activity, file changes, conflicts |
| Semantic Search | `.multilingual-e5-small-elasticsearch` | Built-in embeddings for duplicate task detection — no external API |
| Kibana | Discover + Dashboards | Real-time fleet status, activity stream, event exploration |
| Runners | Claude Agent SDK (Python) | Engineers: poll ES for assignments, execute coding tasks, report back |
| Your Codebase | Local filesystem | Each runner writes code, reports file changes to ES |

Data flow:

  1. You tell Fleet Commander what to build (via Kibana, MCP, or CLI)
  2. Fleet Commander breaks it into tasks with dependencies and creates them in ES via the create_task Workflow
  3. Auto-Assignment Workflow fires every 5 seconds — finds pending unblocked tasks, matches them to idle runners, assigns them
  4. Runners poll ES, find their assignments, launch Claude Agent SDK sessions, write code
  5. During execution, runners log every event and file change to ES in real-time
  6. On completion, runner updates task status → Task Completion Workflow fires automatically, unblocks dependents, moves them to pending
  7. Cycle repeats: Auto-Assignment Workflow picks up the newly-pending tasks and assigns them to idle runners
  8. Stale Agent Workflow runs every 60 seconds, re-queues tasks from crashed runners
  9. Fleet Commander can answer status questions, reassign work, or review completed tasks — anytime via ES|QL tools
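Steps 3 and 7 boil down to a simple matching rule: pending, unblocked tasks go to idle runners, highest priority first. Here is a pure-Python sketch of that rule; the actual workflow runs server-side in Elastic, and the field names used here are illustrative:

```python
def auto_assign(tasks, agents):
    """Match pending, unblocked tasks to idle agents, highest priority first.
    Pure-Python sketch of the auto-assignment rule with assumed field names."""
    pending = sorted(
        (t for t in tasks if t["status"] == "pending" and not t["blocked_by"]),
        key=lambda t: t["priority"],
        reverse=True,
    )
    idle = [a for a in agents if a["status"] == "idle"]
    assignments = []
    for task, agent in zip(pending, idle):
        task["status"] = "assigned"
        task["assigned_to"] = agent["id"]
        agent["status"] = "working"
        assignments.append((task["id"], agent["id"]))
    return assignments

tasks = [
    {"id": "t1", "status": "pending", "blocked_by": [], "priority": 2, "assigned_to": None},
    {"id": "t2", "status": "pending", "blocked_by": ["t1"], "priority": 9, "assigned_to": None},
]
agents = [{"id": "runner-1", "status": "idle"}]
print(auto_assign(tasks, agents))  # -> [('t1', 'runner-1')]
```

Note that `t2` stays put despite its higher priority: it is still blocked by `t1`, so it never enters the candidate pool.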

Elastic Workflows: Where the Orchestration Lives

The orchestration isn't in Python — it's in Elastic. Here are the 4 workflows that power CodeFleet:

| Workflow | Trigger | What It Does |
| --- | --- | --- |
| `create_task` | Manual (called by Fleet Commander) | Indexes a new task document with all fields, including semantic embeddings for duplicate detection |
| `auto_assign_tasks` | Scheduled (every 5s) | Queries for pending unblocked tasks and idle agents, assigns tasks to agents, updates both documents |
| `handle_task_completion` | Alert (task status → completed) | Finds all tasks that depend on the completed one, removes it from their `blocked_by` lists, sets fully-unblocked tasks to pending |
| `handle_stale_agents` | Scheduled (every 60s) | Finds agents with stale heartbeats (>2 min), marks them offline, re-queues their assigned tasks back to pending |
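The unblocking step in `handle_task_completion` is the heart of the dependency cascade. Below is a plain-Python approximation of that logic; the `blocked_by` field and the `blocked` status name are illustrative assumptions, and the real version is a declarative YAML workflow, not Python:

```python
def handle_task_completion(completed_id, tasks):
    """Remove the completed task from every blocked_by list and move tasks
    whose list becomes empty from 'blocked' to 'pending'. Sketch only."""
    unblocked = []
    for task in tasks:
        blocked_by = task.get("blocked_by", [])
        if completed_id in blocked_by:
            blocked_by.remove(completed_id)
            if not blocked_by and task["status"] == "blocked":
                task["status"] = "pending"
                unblocked.append(task["id"])
    return unblocked

tasks = [
    {"id": "t2", "status": "blocked", "blocked_by": ["t1"]},
    {"id": "t3", "status": "blocked", "blocked_by": ["t1", "t4"]},
]
print(handle_task_completion("t1", tasks))  # -> ['t2']
```

`t3` loses one blocker but stays blocked on `t4`, which is exactly the cascade behavior described in the table above.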

ES|QL Tools: Where the Intelligence Lives

The Fleet Commander isn't just a chatbot — it's an agent whose reasoning is grounded in real data. Here are the 8 tools (7 ES|QL queries and 1 Workflow tool) that power it:

| Tool | What It Does | ES\|QL Pattern |
| --- | --- | --- |
| `search_backlog` | Find highest-priority unblocked tasks | Filter by status, sort by priority, exclude blocked |
| `check_agent_status` | See what all agents are doing right now | Aggregate agents by status (idle/working/offline) |
| `detect_conflicts` | Find files modified by multiple agents | Aggregate file changes, filter count > 1 within time window |
| `assign_task` | Look up task details for assignment verification | Query task by ID, verify status before assignment |
| `review_completed` | Review recently completed work | Filter by status=completed, include cost and duration metrics |
| `find_similar_tasks` | Semantic search for duplicate/related tasks | MATCH query on `semantic_text` embeddings (built-in `.multilingual-e5-small-elasticsearch`) |
| `create_task` | Create a new task in the backlog | Workflow tool — indexes task with semantic field for vector search |
| `analyze_dependencies` | Check what's blocked by what | Filter on `depends_on` fields, identify dependency chains |
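The `detect_conflicts` pattern (aggregate file changes, flag files with more than one agent inside a time window) can be approximated in plain Python. The real tool is a server-side ES|QL aggregation; the field names here (`path`, `agent`, `ts`) are illustrative:

```python
from collections import defaultdict
from datetime import datetime, timedelta

def detect_conflicts(file_changes, window=timedelta(minutes=30)):
    """Flag files touched by more than one agent within the time window.
    Plain-Python stand-in for the ES|QL aggregation; field names assumed."""
    cutoff = max(c["ts"] for c in file_changes) - window
    agents_per_file = defaultdict(set)
    for change in file_changes:
        if change["ts"] >= cutoff:
            agents_per_file[change["path"]].add(change["agent"])
    return {path: sorted(agents)
            for path, agents in agents_per_file.items() if len(agents) > 1}

now = datetime(2024, 1, 1, 12, 0)
changes = [
    {"path": "app/api.py", "agent": "runner-1", "ts": now},
    {"path": "app/api.py", "agent": "runner-2", "ts": now - timedelta(minutes=5)},
    {"path": "README.md", "agent": "runner-1", "ts": now},
]
print(detect_conflicts(changes))  # -> {'app/api.py': ['runner-1', 'runner-2']}
```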

This is the pattern we love about Agent Builder: business logic lives in expert-defined ES|QL queries, not in LLM hallucinations. The agent reasons about what to do, but the data it reasons over is precise, real-time, and grounded in Elasticsearch.


Agent Builder Features Used

| Feature | How We Use It |
| --- | --- |
| Custom Agents | Fleet Commander with specialized fleet orchestration instructions |
| ES\|QL Tools | 7 parameterized tools for backlog search, agent status, conflict detection, semantic search, dependency analysis, task lookup, work review |
| Elastic Workflows | 4 workflows: task creation, auto-assignment, dependency unblocking, stale agent recovery |
| Semantic Search | `semantic_text` field with built-in `.multilingual-e5-small-elasticsearch` inference — no external embedding API |
| MCP Server | Fleet Commander accessible from Claude Desktop — natural language fleet management from your IDE |
| Kibana Discover | Real-time activity stream: every agent event indexed and explorable |

Challenges and Things We Liked

Things we liked:

  1. The ES|QL parameterized tool pattern is production-grade. In most AI agent frameworks, the agent writes its own queries — which means hallucinated SQL, wrong table names, off-by-one filters. With Agent Builder, the queries are written by a domain expert and the agent just fills in parameters. This is how production AI should work, and it's the single biggest differentiator we've seen in any agent framework.

  2. Elastic Workflows let us move orchestration to the cloud. The auto-assignment and dependency unblocking logic that would normally be a complex Python state machine is instead 3 declarative YAML files running server-side in Elastic. When a task completes, the unblocking workflow fires automatically — no local daemon needed for that logic. This means the "brain" runs in Elastic and only the "hands" (Claude Agent SDK runners) need to run locally.

  3. Built-in semantic search was a game-changer. The semantic_text field type with .multilingual-e5-small-elasticsearch gave us vector search for duplicate task detection with zero external dependencies. No OpenAI embeddings API, no Pinecone, no vector DB setup. Just a field type in the mapping and it works.

Challenges:

  1. Workflow string handling required coercion logic. Elastic Workflows pass all inputs as strings, but our Pydantic models expect typed lists for fields like `depends_on` and `file_scope`. We added field validators to coerce CSV strings into lists, and patched our ES search layer to normalize these fields on read. Small friction, but it shows Workflows are still maturing.
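The coercion boils down to one normalization rule. It is shown here as a plain function rather than the actual Pydantic validator, with the comma-separated input format assumed:

```python
def coerce_list(value):
    """Normalize a Workflow input into a typed list. Workflows hand us
    strings, so "t1, t2" must become ["t1", "t2"]; already-parsed lists
    and empty values pass through. Sketch of our validator's logic."""
    if value is None or value == "":
        return []
    if isinstance(value, list):
        return value
    return [item.strip() for item in str(value).split(",") if item.strip()]

print(coerce_list("t1, t2,t3"))  # -> ['t1', 't2', 't3']
```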

  2. ES|QL string parameters can't be used in timespan expressions. Our `review_completed` tool needed restructuring because you can't parameterize `NOW() - ?since`. We also discovered that Agent Builder's param type validation requires `integer` instead of `number` — small things that cost real debugging time.
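One workaround for the timespan limitation: compute the cutoff client-side and pass a single concrete timestamp parameter, so the query never does timespan arithmetic on a parameter. A sketch, with an illustrative index and field name in the commented query:

```python
from datetime import datetime, timedelta, timezone

def completed_since_cutoff(hours: int) -> str:
    """Compute a concrete UTC cutoff timestamp client-side, so the ES|QL
    query only compares against one timestamp parameter instead of
    evaluating NOW() - ?since. Sketch; index/field names are assumptions."""
    return (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()

# The tool's query can then compare against the precomputed value, e.g.:
#   FROM codefleet-tasks
#   | WHERE status == "completed" AND completed_at >= ?cutoff
cutoff = completed_since_cutoff(24)
print(cutoff.endswith("+00:00"))  # -> True
```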


What's Next

| Feature | Status |
| --- | --- |
| Task queue with dependencies, priorities, file scope | ✅ Shipped |
| Fleet Commander with 8 tools (7 ES\|QL + 1 workflow) | ✅ Shipped |
| Claude Agent SDK runners with polling + execution | ✅ Shipped |
| Elastic Workflows: auto-assign, unblock deps, stale recovery | ✅ Shipped |
| Conflict detection tool via ES aggregations | ✅ Shipped |
| Semantic search via built-in Elastic Inference embeddings | ✅ Shipped |
| MCP integration (Claude Desktop) | ✅ Shipped |
| Kibana Discover for real-time event exploration | ✅ Shipped |
| Dynamic runner scaling (0→N based on queue) | ✅ Shipped |
| Automatic dependency unblocking via Workflow | ✅ Shipped |
| Fleet Commander task creation via Workflow tool | ✅ Shipped |
| A2A protocol for cross-framework agent communication | 🔜 Next |
| Git automation (worktree creation, branch merge, PR) | 🔜 Next |
| Custom Kibana dashboards with panels | 🔜 Next |
| Multi-model routing (Opus for complex, Sonnet for simple) | 📋 Planned |
| GitHub/Linear integration for importing backlogs | 📋 Planned |

CodeFleet is open source (Apache 2.0) and designed to run headlessly on always-on machines. It's not a demo — it's a tool built for real AI coding workflows, with Elasticsearch as the orchestration platform.

🔗 GitHub Repository | 📹 Demo Video
