Inspiration

AI has already transformed software creation by turning language into code, but most systems still need constant human supervision.

Our goal was to push past that limit: run long-horizon autonomous coding with many agents in parallel, while keeping them aligned to one objective.

Hole_In_One is our orchestration architecture for managing long-running agent swarms that can plan, build, test, review, and ship software end to end.

Problem

Most coding-agent systems today are iterative loops: either a human supervises each step, or one agent runs alone for long stretches. That works for small tasks, but long runs become brittle, lose context, and drift.

“Subagents” often help reasoning but not true execution, since they are usually just extra model calls with partial context. Even systems with isolated sandboxes still struggle at scale (100k+ LOC, thousands of commits) because global coherence, shared state, and quality enforcement break down over time.

What It Does

  • Drives Cursor cloud agents from a CLI, using either a single hole-in-one goal or a multi-step plan generated via CLōD.
  • Splits plans into builder PRs created through the Cursor Cloud Agents HTTP API.
  • Runs a Greptile review pass on every PR to catch risky patterns before merge.
  • Optionally runs fix rounds and a CLōD second validation pass after review.
  • Handles merge flows automatically: GitHub auto-merge, REST fallback, and polled wait-between-steps.
  • Streams live build state (plan, current step, agent status) to a Next.js dashboard via a FastAPI bridge.
  • Dashboard uses a Cursor-inspired dark UI (zinc + violet) with live polling.
  • Fully env-driven tuning for merge polling intervals, planner token limits, and more.
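As a rough illustration of the env-driven tuning mentioned in the last bullet, the sketch below reads a few settings from the environment with defaults and basic validation. The variable names (MERGE_POLL_SECONDS, MERGE_TIMEOUT_SECONDS, PLANNER_MAX_TOKENS) and their default values are assumptions made for this example, not the project's actual configuration keys.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Tuning:
    """Hypothetical tuning knobs; the names and defaults are illustrative only."""
    merge_poll_seconds: float
    merge_timeout_seconds: float
    planner_max_tokens: int


def _positive(raw, cast, name):
    """Parse a raw string with `cast` and reject zero or negative values."""
    value = cast(raw)
    if value <= 0:
        raise ValueError(f"{name} must be positive, got {raw!r}")
    return value


def load_tuning(env=None):
    """Build a Tuning object from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env

    def read(name, default, cast):
        return _positive(env.get(name, default), cast, name)

    return Tuning(
        merge_poll_seconds=read("MERGE_POLL_SECONDS", "15", float),
        merge_timeout_seconds=read("MERGE_TIMEOUT_SECONDS", "900", float),
        planner_max_tokens=read("PLANNER_MAX_TOKENS", "4096", int),
    )
```

Passing a plain dict instead of os.environ keeps the loader easy to test, and failing fast on a non-positive value avoids a polling loop that never sleeps.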

How We Built It

Hole In One is a modular Python + web system split across orchestration, review, merge control, and visualization.

Backend Agent

  • src/hole_in_one/orchestrate.py: main runtime loop (builder PR creation, optional subagent fanout, Greptile/GrepLite review wait, fix rounds, merge, continuous cycles).
  • src/hole_in_one/cursor_api.py: Cursor Cloud Agents HTTP integration.
  • src/hole_in_one/github_api.py: GitHub PR/check/review polling plus GraphQL auto-merge and REST merge fallback.
  • src/hole_in_one/clod_api.py: optional CLōD planner, Greptile feedback compression, and second-pass validator.
  • src/hole_in_one/dashboard_store.py + dashboard_api.py: live in-memory state + FastAPI snapshot endpoints.
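The live in-memory state behind the dashboard could look roughly like the sketch below: a lock-protected dictionary that the orchestrator writes to and a snapshot method that hands out copies. The class name, field names, and method names are assumptions for illustration and are not taken from dashboard_store.py.

```python
import copy
import threading
import time


class DashboardStore:
    """Hypothetical thread-safe holder for live build state."""

    def __init__(self):
        self._lock = threading.Lock()
        self._state = {
            "plan": [],
            "current_step": None,
            "agents": {},
            "updated_at": None,
        }

    def _touch(self):
        # Caller must already hold the lock.
        self._state["updated_at"] = time.time()

    def set_plan(self, steps):
        with self._lock:
            self._state["plan"] = list(steps)
            self._touch()

    def set_current_step(self, index):
        with self._lock:
            self._state["current_step"] = index
            self._touch()

    def set_agent_status(self, agent_id, status):
        with self._lock:
            self._state["agents"][agent_id] = status
            self._touch()

    def snapshot(self):
        # Deep copy so readers cannot mutate the live state.
        with self._lock:
            return copy.deepcopy(self._state)
```

A snapshot endpoint would then only need to serialize the result of snapshot(), which keeps the web layer free of locking concerns.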

web/ (Next.js): frontend dashboard for live and mock orchestration views.

AI & Agent Logic

The runtime is staged and throughput-focused rather than a single-agent loop:

  • A builder agent opens/updates a PR.
  • Optional CLōD planning can split one high-level goal into sequential builder tasks.
  • Optional workstream decomposition fans a PR into parallel implementation subagents on the same branch.
  • Greptile/GrepLite review is polled via GitHub checks/reviews/comments before merge decisions.
  • Fix rounds spawn fresh PR-scoped fix agents (single or parallel chunked fixers).
  • Optional CLōD second validator reviews Greptile summary + unified diffs and returns VERDICT: PASS/FAIL.
  • Merge flow supports queued auto-merge, immediate REST merge, and continuous post-merge cycling.

Infrastructure

  • Python 3.11 CLI with httpx + .env-driven controls.
  • Cursor Cloud runs agents; GitHub is the shared source of PR/check/merge truth.
  • Parallelism is controlled with bounded workers (MAX_PARALLEL_WORKSTREAMS, MAX_PARALLEL_FIXERS).
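One simple way to bound parallel workers is a fixed-size thread pool, sketched below. MAX_PARALLEL_WORKSTREAMS comes from the list above; the default of 3, the run_bounded helper, and the use of threads rather than another concurrency model are assumptions for this example.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_bounded(tasks, worker, limit=None):
    """Run worker(task) for every task with at most `limit` running at once.

    Results come back in the same order as `tasks`. If a worker raises,
    the exception propagates when its result is collected.
    """
    if limit is None:
        # Default value is illustrative, not the project's real default.
        limit = int(os.environ.get("MAX_PARALLEL_WORKSTREAMS", "3"))
    limit = max(1, limit)
    with ThreadPoolExecutor(max_workers=limit) as pool:
        return list(pool.map(worker, tasks))
```

The same helper could serve fix rounds by passing a limit read from MAX_PARALLEL_FIXERS instead.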

FastAPI + Uvicorn expose /api/dashboard/snapshot and /api/dashboard/health for live UI polling.

Interface

  • Full-stack dashboard: Next.js frontend + FastAPI backend snapshot bridge.
  • Tabs: Agent Grid, Activity, Graph (including tree/graph child-agent visibility).
  • Legacy Textual/Rich terminal UI still exists, but web is the primary interface.

Challenges

Auto-merge can fail in practice because of repository permissions or branch protection rules, so REST fallback paths were necessary.
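The fallback idea can be sketched as a small function that tries auto-merge first and drops to a direct merge if that fails or never completes. The three callables are hypothetical stand-ins for the real GitHub calls, and MergeError is an exception type invented for this example.

```python
class MergeError(Exception):
    """Hypothetical error raised when a merge attempt is rejected."""


def merge_with_fallback(pr_number, enable_auto_merge, wait_until_merged, rest_merge):
    """Try auto-merge, then fall back to a direct REST merge.

    enable_auto_merge(pr) may raise MergeError (e.g. blocked by branch rules).
    wait_until_merged(pr) returns True if the PR merged before a timeout.
    rest_merge(pr) performs the direct merge and may raise MergeError.
    Returns the name of the path that succeeded.
    """
    try:
        enable_auto_merge(pr_number)
        if wait_until_merged(pr_number):
            return "auto-merge"
    except MergeError:
        # Auto-merge was refused; continue to the fallback below.
        pass
    # Either auto-merge was refused or the wait timed out.
    rest_merge(pr_number)
    return "rest"
```

Injecting the callables keeps the decision logic testable without network access; a failure in rest_merge is deliberately left to propagate so the orchestrator can surface it.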

Accomplishments

  • Shipped a working autonomous PR loop: plan → implement → review → fix → merge.
  • Added pre-merge Greptile/GrepLite review and optional second-pass validation.
  • Built and tested a real full-stack web app workflow with this architecture.
  • Delivered a live ops dashboard for real-time orchestration visibility.
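For readers who want a concrete picture of the implement → review → fix → merge stages, here is a minimal sketch of one step with a capped number of fix rounds and an optional validator. All function names and the max_fix_rounds parameter are assumptions for this example; the only detail taken from the writeup is that the validator reply carries a VERDICT: PASS/FAIL line, and the exact line format matched here is itself a guess.

```python
import re

# Assumed format: a line of its own reading "VERDICT: PASS" or "VERDICT: FAIL".
VERDICT_RE = re.compile(r"^VERDICT:\s*(PASS|FAIL)\s*$", re.MULTILINE)


def parse_verdict(text):
    """Return True only when the text contains an explicit PASS verdict."""
    match = VERDICT_RE.search(text)
    return match is not None and match.group(1) == "PASS"


def run_step(task, build, review, fix, merge, validate=None, max_fix_rounds=2):
    """Run one task through build, review, fix rounds, validation, and merge.

    build(task) returns a PR handle; review(pr) returns a list of issues;
    fix(pr, issues) applies fixes; validate(pr) returns the validator's text.
    Returns "merged" or "blocked".
    """
    pr = build(task)
    for round_no in range(max_fix_rounds + 1):
        issues = review(pr)
        if not issues:
            break
        if round_no == max_fix_rounds:
            return "blocked"  # still failing review after the last allowed fix
        fix(pr, issues)
    if validate is not None and not parse_verdict(validate(pr)):
        return "blocked"
    merge(pr)
    return "merged"
```

Treating a missing or malformed verdict as a failure is a conservative choice: the step only merges when the validator says PASS explicitly.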

What We Learned

Clear task slicing plus explicit fallback logic is more important than raw agent count.

What’s Next

Better throughput and cost tuning for long continuous runs.

Built With

  • clod
  • cursor
  • fastapi
  • greptile
  • nextjs
  • openai
  • python