Inspiration

Security teams don’t have a “lack of data” problem — they have a “too much data, not enough clarity” problem.

Every scan produces a pile of artifacts: SARIF findings, SBOMs, package locks, scanner logs… and then humans spend hours doing the same painful work:

  • gather files from different places,
  • reconcile formats and naming,
  • decide what’s real,
  • and translate it into something a team can act on.

Argonaut came from wanting that messy middle to stop being a hand-wavy ritual. The goal is to make it boring, repeatable, and provable, and to do it in a way that feels “agentic” without being unsafe.

Argonaut is NOT “an AI that fixes your security.” It’s an agent-style workflow that reliably turns security noise into a clean, queryable run, then narrows the noise down to a small set of things a human can confidently act on.


What it does

Argonaut is an agentic auto-triage workflow built on Elasticsearch + Kibana + Elastic Agent Builder, with Slack as the notification layer.

A new scan is acquired, the agent runs a staged pipeline, and every step leaves evidence in Elasticsearch (run header + task logs + outputs). The UI and Kibana dashboards are just views on top of that same data.

Pipeline (current):

  1. Acquire: ingest bundle inputs deterministically (currently via object-store drop; API push or MCP connections to common security tools in the future, to acquire autonomously), then validate structure + hashes
  2. Normalize: convert artifacts into canonical document shapes (stable IDs, stable ordering)
  3. Write: persist run state + normalized docs into Elasticsearch in a fixed, predictable order
  4. Triage: compute reachability/fixability slices (the “800 → 50 → 7” moment)
  5. Notify: post to Slack when a scan is received and when a fix bundle is ready
  6. Fix Bundle (safe handoff): use AI to generate a developer-ready fix artifact and hand it off securely
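The triage stage above can be sketched as a pure filtering pass. This is a minimal sketch, not Argonaut's actual implementation; the field names (`severity`, `reachable`, `fix_available`) are illustrative assumptions:

```python
# Minimal sketch of Argonaut's triage slicing: the "800 -> 50 -> 7" narrowing
# expressed as pure filters over a findings list. Field names are assumptions,
# not the real schema.

def triage_slices(findings):
    """Return progressively narrower slices of a raw findings list."""
    # Slice 1: keep medium+ severity, dropping informational noise.
    relevant = [f for f in findings
                if f.get("severity") in ("medium", "high", "critical")]
    # Slice 2: keep findings whose vulnerable code path is actually reachable.
    reachable = [f for f in relevant if f.get("reachable")]
    # Slice 3: keep reachable findings that have a known fix available.
    actionable = [f for f in reachable if f.get("fix_available")]
    return relevant, reachable, actionable

# Synthetic scan: 800 raw findings narrowing to 50 relevant and 7 actionable.
findings = ([{"severity": "low"}] * 750
            + [{"severity": "high", "reachable": False}] * 43
            + [{"severity": "high", "reachable": True, "fix_available": True}] * 7)
relevant, reachable, actionable = triage_slices(findings)
```

Because each slice is a deterministic filter over immutable inputs, rerunning triage on the same bundle always yields the same counts.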

Fixes are never auto-applied: it’s a reviewed handoff, not a silent change.

The key idea is that “the agent did this” isn’t a claim but an actual evidence trail. Runs, stages, and task traces are searchable in Elasticsearch and explorable in Kibana.
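The evidence trail can be sketched as two document shapes, a run header and per-stage task logs, ready to index into Elasticsearch. This is a hedged sketch: the index names (`argonaut-runs`, `argonaut-tasks`) and field layout are assumptions, but the principle is the one described above, with IDs derived deterministically so re-indexing the same run overwrites rather than duplicates:

```python
# Sketch of the evidence-trail documents: a run header plus per-stage task
# logs, shaped as plain dicts. Index names and fields are illustrative
# assumptions. IDs hash only identifying fields (never timestamps), so
# reruns map onto the same documents instead of creating duplicates.
import hashlib

def doc_id(*parts):
    """Stable document ID derived only from identifying fields."""
    return hashlib.sha256("|".join(parts).encode()).hexdigest()[:24]

def run_header(run_id, bundle_hash):
    return {"_index": "argonaut-runs", "_id": doc_id("run", run_id),
            "run_id": run_id, "bundle_hash": bundle_hash, "status": "started"}

def task_log(run_id, stage, seq, message):
    return {"_index": "argonaut-tasks",
            "_id": doc_id("task", run_id, stage, str(seq)),
            "run_id": run_id, "stage": stage, "seq": seq, "message": message}

header = run_header("run-0001", "sha256:...")
tasks = [task_log("run-0001", "normalize", 1, "parsed 3 artifacts"),
         task_log("run-0001", "write", 1, "indexed 50 findings")]
```

With this shape, “show me everything the agent did for run X” is a single query on `run_id`, in either the console or Kibana.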


Before vs After (User Journey)

Before

  • A scan finishes → 800–2000 findings
  • Results get triaged in multiple tools and copied into docs/spreadsheets/Slack threads
  • People debate reachability and priority in meetings with partial context
  • “Fix it” becomes a vague backlog item with broken links and missing evidence
  • There’s no clean answer to: what changed since last run, and what should we do next?

After (Argonaut)

  • A new bundle arrives → Slack posts Scan received
  • The agent runs the pipeline and records a run timeline in Elasticsearch
  • In the console, you see 7 reachable findings, not a scary pile of raw ones
  • You click into the run to see the stage-by-stage pipeline trail, then into Findings
  • You select the reachable findings and generate a fix bundle (again: not applied, but uploaded for dev review to a secure location)
  • Slack posts Fix bundle ready to the developer, with a link + run reference
  • Kibana dashboards back it up: run health, task logs, and results — all queryable
  • An agent helper is available in the console to explain each step and its findings
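The two Slack signals in the journey above can be sketched as a small notifier. The payload shape follows Slack's incoming-webhook format (`{"text": ...}`); the message wording, the `SLACK_WEBHOOK_URL` environment variable, and the dry-run fallback are assumptions of this sketch, not Argonaut's actual integration:

```python
# Sketch of the two hard Slack signals: "scan received" and "fix bundle ready".
# Payload shape matches Slack incoming webhooks ({"text": ...}); everything
# else (env var name, wording, dry-run behavior) is an assumption.
import json
import os
import urllib.request

def slack_payload(event, run_id, link=None):
    """Build the webhook body for one of the two hard signals."""
    messages = {
        "scan_received": f"Scan received: run {run_id} started.",
        "fix_bundle_ready": f"Fix bundle ready for run {run_id}: {link}",
    }
    return {"text": messages[event]}

def notify(event, run_id, link=None):
    payload = slack_payload(event, run_id, link)
    url = os.environ.get("SLACK_WEBHOOK_URL")
    if not url:
        return payload  # dry run: no webhook configured in this sketch
    req = urllib.request.Request(url, data=json.dumps(payload).encode(),
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)
    return payload
```

Keeping the notifier down to two unambiguous events is deliberate: Slack stays a signal channel with a link back to the run, while the evidence itself lives in Elasticsearch.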

How I built it

  • Agent-style workflow, evidence-first: every run has a run header and a task stream so you can explain what happened without guessing.
  • Deterministic data plane: stable IDs, stable ordering, idempotent writes so reruns don’t create drift or duplicates.
  • Elasticsearch as the source of truth: runs, tasks, findings, and outputs live as ES documents.
  • Kibana dashboards: not screenshots but real saved views on top of the indices.
  • Slack integration: two hard signals in the demo, scan received and fix bundle ready
  • Thin console UX: a simple dashboard with a run-pipeline view, plus the ability to ask the agent about findings and next actions.
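The “deterministic data plane” point above can be sketched as a write planner: normalized docs are sorted by a stable key and given content-derived IDs, so rerunning the same scan re-emits an identical, idempotent list of index actions. The index name, sort key, and action shape here are assumptions modeled on Elasticsearch bulk actions, not Argonaut's actual code:

```python
# Sketch of deterministic, idempotent writes: stable sort order plus
# content-derived _id values mean a rerun produces the exact same bulk
# actions, so Elasticsearch overwrites instead of duplicating.
# Index name, fields, and sort key are illustrative assumptions.
import hashlib
import json

def stable_id(doc):
    # Hash canonical JSON of identifying content only (no timestamps),
    # so the same logical finding always maps to the same _id.
    canonical = json.dumps(doc, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()[:24]

def write_plan(docs, index="argonaut-findings"):
    """Produce a fixed, predictable list of index actions for a bulk write."""
    ordered = sorted(docs, key=lambda d: (d["artifact"], d["rule"], d["location"]))
    return [{"_op_type": "index", "_index": index, "_id": stable_id(d), **d}
            for d in ordered]
```

Because both the ordering and the IDs are functions of document content alone, comparing two runs' write plans is a plain equality check.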

Challenges I ran into

  • Determinism is deceptively hard. File ordering, retries, timestamps, and partial failures can all introduce drift if you don’t lock them down.
  • Idempotency vs “agent runs”: you want reruns, but you don’t want duplicate actions or messy state.
  • Making the agent story credible: the solution was treating run headers + task logs as first-class ES documents and building everything around that.
  • Keeping actioning safe: generating fix artifacts is useful, but auto-applying changes is risky. The demo needed a clean “handoff, not autopilot” contract.

Accomplishments that I am proud of

  • End-to-end flow that feels real: scan in → triage → 7 reachable → fix bundle → Slack handoff
  • “Agent did this” is visible and provable in Kibana (runs + task logs + outputs)
  • Deterministic ingestion and idempotent writes that hold up across reruns
  • A demo path that’s simple enough to follow, but still shows real ES + Kibana value

What I learned

  • Trust comes from repeatability more than cleverness.
  • Agent workflows only work if the evidence trail is structured and queryable.
  • Once the data plane contracts are stable, UX becomes dramatically easier as every screen turns into “just a query.”

What’s next for Argonaut

  • Expand artifact coverage and edge-case handling (more tools, more bundle variants)
  • Make enrichment stages pluggable and gated so “smart” never breaks determinism
  • Better multi-run comparisons: what changed since the last build?
  • Stronger actioning loop: fix bundles → PR templates → ticket integrations (still auditable and idempotent)
  • More Kibana views for operators: drift detection, run quality, and triage trendlines
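The multi-run comparison item above could start as a simple set diff over finding IDs from two runs. This is a speculative sketch of future work, not shipped behavior; the bucket names are assumptions:

```python
# Speculative sketch of "what changed since the last run": diff two runs'
# finding-ID sets into new / resolved / persisting buckets. Because finding
# IDs are content-derived and stable, set operations are enough.

def diff_runs(previous_ids, current_ids):
    """Compare the finding IDs of two runs of the same project."""
    prev, curr = set(previous_ids), set(current_ids)
    return {"new": sorted(curr - prev),          # appeared this run
            "resolved": sorted(prev - curr),     # gone since last run
            "persisting": sorted(prev & curr)}   # still present
```

This only works because the data plane already guarantees stable finding IDs across reruns; without that contract, every run would look like 100% churn.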