## Inspiration

Field-robot fleets fail silently. Stuck mowers, battery drift, torque loss, and boundary violations often go unnoticed, and by the time a human reads the Splunk alert, the SLA is already breached. We wanted observability that acts, not just alerts.

Cadence Cockpit is Sustain, the third stage of Forenly's robot-fleet lifecycle: after Acquire (Lawn Advisor) recommends the right machine and Operate (FleetMind) deploys it, Sustain keeps it healthy in the field without paging a human.

## What it does

Cadence Cockpit is a closed-loop observability platform for deployed autonomous robot fleets. It:

  1. Ingests live telemetry (battery, motors, GPS / boundary, coverage, dock cycles, faults) into Splunk Cloud via HEC.
  2. Triages anomalies through a Gemini multi-agent orchestrator that uses the Splunk MCP Server as a guided root-cause query layer.
  3. Acts — reassigns coverage, reroutes around obstacles, schedules preventive maintenance, rebalances SLA load — and writes the outcome back to a live digital twin.
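
As a rough illustration of the ingest step, the sketch below builds a Splunk HEC request with the standard library only. The host, token, sourcetype, and telemetry field names are placeholders assumed for the example, not values from the project; only the `/services/collector/event` endpoint and the `Authorization: Splunk <token>` header follow the documented HEC convention.

```python
import json
import urllib.request


def build_hec_request(base_url, token, event, sourcetype="robot:telemetry"):
    """Build (but do not send) a Splunk HEC request for one telemetry event.

    The sourcetype default is a hypothetical name chosen for this sketch.
    """
    payload = {"event": event, "sourcetype": sourcetype}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Splunk " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical telemetry sample; field names are illustrative only.
sample = {"robot_id": "mower-07", "battery_pct": 41.5, "fault": None}
req = build_hec_request(
    "https://splunk.example.com:8088",  # placeholder host, not a real endpoint
    "00000000-0000-0000-0000-000000000000",  # placeholder HEC token
    sample,
)
# Sending would then be: urllib.request.urlopen(req, timeout=5)
```

Building the request separately from sending it keeps the payload easy to inspect in tests and offline demos.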

Three simulation channels run in parallel:

  • Operational Health — stuck robots, torque loss, boundary drift; coverage is reassigned and routes are corrected without human intervention.
  • Predictive Maintenance — battery / motor wear trends are surfaced before failure; service is scheduled during non-operational windows.
  • SLA & Coverage — contract-level commitments are monitored; fleet scheduling is rebalanced across sites when coverage drops below the contracted threshold.
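
To make the Predictive Maintenance channel concrete, here is a minimal sketch of one way a wear trend could be surfaced before failure: fit a least-squares line to recent readings and estimate how long until it crosses a service threshold. The function name, the sample readings, and the threshold are assumptions for illustration; the write-up does not say which trend model the project actually uses.

```python
def hours_until_threshold(samples, threshold):
    """Estimate hours until a degrading metric reaches `threshold`.

    `samples` is a list of (hour, value) pairs. Returns None when there are
    too few points or the metric is not trending downward.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all readings share one timestamp; no slope to fit
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    if slope >= 0:
        return None  # flat or improving, nothing to schedule
    intercept = mean_v - slope * mean_t
    crossing_hour = (threshold - intercept) / slope
    last_hour = max(t for t, _ in samples)
    # Clamp at zero when the fitted line is already below the threshold.
    return max(0.0, crossing_hour - last_hour)


# Hypothetical battery-capacity readings: (hours since first sample, percent).
readings = [(0, 100.0), (10, 98.0), (20, 96.0), (30, 94.0)]
remaining = hours_until_threshold(readings, threshold=80.0)
```

A result like `remaining` could then be compared against the next non-operational window to decide whether service should be booked now or later.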

## How we built it

  • Splunk Cloud (HEC + REST) for fleet telemetry ingest and query.
  • Splunk MCP Server as the agent's guided root-cause query layer — Gemini can only invoke curated SPL templates, never freeform SPL.
  • Gemini (GCP) as the multi-agent orchestrator that decides what to query and which corrective action to take.
  • Python standard library only: zero external dependencies, so it runs wherever a Python interpreter is available.
  • HTML / JavaScript Cockpit control room (dashboard.html) and fleet monitor (fleetmng.html).
  • Optional Slack human-in-the-loop escalation for actions outside the agent's autonomy envelope.
  • Deterministic fallback digital twin so demos and offline runs still produce coherent decisions if Gemini is unreachable.
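
The "curated SPL templates only" constraint can be sketched as a small registry plus strict parameter validation. Everything named here (template names, the `fleet_telemetry` index, field names, and the `render_query` helper) is hypothetical; the project's actual templates live in `agent/splunk_mcp_server.py` and are not shown in this write-up.

```python
import re

# Hypothetical templates; index and field names are assumptions.
SPL_TEMPLATES = {
    "battery_trend": (
        'search index=fleet_telemetry robot_id="{robot_id}" earliest=-{minutes}m '
        "| timechart span=5m avg(battery_pct) AS avg_battery"
    ),
    "recent_faults": (
        'search index=fleet_telemetry robot_id="{robot_id}" fault=* earliest=-{minutes}m '
        "| stats count BY fault"
    ),
}

# Allowlist patterns: anything outside these character sets is rejected,
# so a parameter cannot smuggle in quotes, pipes, or extra SPL commands.
PARAM_PATTERNS = {
    "robot_id": re.compile(r"[A-Za-z0-9_-]{1,32}"),
    "minutes": re.compile(r"[0-9]{1,4}"),
}


def render_query(template_name, **params):
    """Render a curated template, refusing unknown templates or unsafe values."""
    if template_name not in SPL_TEMPLATES:
        raise ValueError(f"unknown template: {template_name!r}")
    if set(params) != set(PARAM_PATTERNS):
        raise ValueError(f"expected parameters: {sorted(PARAM_PATTERNS)}")
    cleaned = {}
    for name, value in params.items():
        text = str(value)
        if not PARAM_PATTERNS[name].fullmatch(text):
            raise ValueError(f"invalid value for {name!r}")
        cleaned[name] = text
    return SPL_TEMPLATES[template_name].format(**cleaned)


query = render_query("battery_trend", robot_id="mower-07", minutes=30)
```

With this shape, the model only ever chooses a template name and fills in validated parameters, and the SPL text itself stays under human review.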

Repo layout:

  • server.py — HTTP bridge between UI, Splunk HEC, and the orchestrator.
  • agent/app.py — Gemini decision engine.
  • agent/splunk_mcp_server.py — MCP query interface.
  • scripts/seed_telemetry.py — sample data injection for demos.

## Challenges we ran into

Closing the loop safely was the hard part. An agent that acts on production telemetry needs guardrails:

  • We wrapped Splunk access in an MCP layer so Gemini queries through curated SPL templates only — never freeform.
  • Every corrective action is dry-run against the digital twin before it is committed to the real fleet bus.
  • We kept the Python core dependency-free so the entire system can be audited in a single read pass — no hidden behavior in a transitive package.
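
The dry-run guardrail can be pictured as "apply to a copy of the twin, check invariants, then commit". The sketch below assumes a toy twin that maps robots to coverage zones and a single invariant (a per-robot zone cap); the state shape, the cap, and the `commit` callback standing in for the fleet bus are all illustrative assumptions.

```python
import copy


def apply_action(state, action):
    """Move one coverage zone from one robot to another, in place."""
    assignments = state["assignments"]
    assignments[action["from_robot"]].remove(action["zone"])
    assignments[action["to_robot"]].append(action["zone"])


def violations(state, max_zones_per_robot=3):
    """Return robots that would exceed the (hypothetical) zone cap."""
    return sorted(
        robot
        for robot, zones in state["assignments"].items()
        if len(zones) > max_zones_per_robot
    )


def dry_run_then_commit(twin, action, commit):
    """Try the action on a deep copy of the twin; commit only if it is clean."""
    trial = copy.deepcopy(twin)
    apply_action(trial, action)
    problems = violations(trial)
    if problems:
        return False, problems  # real twin and fleet bus are left untouched
    commit(action)  # stand-in for publishing to the real fleet bus
    apply_action(twin, action)  # keep the live twin in sync with the commit
    return True, []


twin = {"assignments": {"r1": ["north"], "r2": ["east", "south", "west"]}}
sent = []  # records what would have gone out on the fleet bus
rejected = dry_run_then_commit(
    twin, {"zone": "north", "from_robot": "r1", "to_robot": "r2"}, sent.append
)
accepted = dry_run_then_commit(
    twin, {"zone": "east", "from_robot": "r2", "to_robot": "r1"}, sent.append
)
```

Here the first action is refused because it would overload `r2`, while the second passes the check and is the only one that reaches `sent`.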

## Accomplishments that we're proud of

  • A real Splunk integration (HEC ingest + REST queries), not a mock.
  • A live digital twin that updates from telemetry and reflects every agent decision.
  • Closed-loop autonomy demonstrated across three independent failure modes.
  • Zero external Python dependencies — the entire agent runs on the standard library.

## What we learned

  • MCP is a great fit for giving an LLM "guarded" access to an observability backend — you keep the agent intelligent without giving it root.
  • A deterministic fallback twin is worth building even when the live model works: it turns the demo from "hope the API is up" into "always green."
  • Closed-loop observability changes the operator's job — alerts become audit trails of actions already taken, not tasks to be done.
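
A minimal sketch of the deterministic-fallback idea: try the model first, and if the call fails with a network-style error, fall back to fixed rules so the same telemetry always yields the same action. The rule thresholds, action names, and the `model_call` callable (a stand-in for the Gemini request) are assumptions made for this example.

```python
def fallback_decision(telemetry):
    """Rule-based decision: identical input always produces identical output."""
    if telemetry.get("stuck"):
        return {"action": "reassign_coverage", "source": "fallback"}
    if telemetry.get("battery_pct", 100) < 20:  # hypothetical threshold
        return {"action": "return_to_dock", "source": "fallback"}
    return {"action": "none", "source": "fallback"}


def decide(telemetry, model_call):
    """Prefer the model's decision; use the fixed rules if it is unreachable."""
    try:
        decision = model_call(telemetry)
    except OSError:
        # ConnectionError, TimeoutError, and urllib's URLError all derive
        # from OSError, so one clause covers the usual network failures.
        return fallback_decision(telemetry)
    return {**decision, "source": "model"}


def unreachable_model(telemetry):
    """Simulates an outage for demo and offline runs."""
    raise ConnectionError("model endpoint unreachable")


offline = decide({"robot_id": "mower-07", "stuck": True}, unreachable_model)
online = decide({"battery_pct": 55}, lambda t: {"action": "none"})
```

Tagging each decision with its `source` keeps the audit trail honest about whether the model or the fallback rules produced an action.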

## What's next

Tighter integration with the upstream Forenly stack — Acquire (Lawn Advisor) and Operate (FleetMind) — so Cadence Cockpit becomes the production Sustain layer for the full robot-fleet lifecycle, with autonomy envelopes tuned per customer SLA.

## Built with

splunk-cloud, splunk-mcp-server, gemini, gcp, python, html, javascript, slack, mcp, digital-twin
