OpsGuard — Zero-Trust AI Remediation

CLI execution flow: crash reproduced, patch generated, and fix verified inside Docker. Status: SUCCESS.
Full-file patch returned by the LLM after passing AST parsing, truncation detection, and structural validation gates.
Human-readable audit artifact detailing patch impact, modified lines, and verification status.

Inspiration

AI systems can generate code fixes rapidly, but they cannot prove those fixes are safe. While experimenting with LLM-based patching, we observed recurring issues: hallucinated imports, truncated files, silent regressions, and uncontrolled refactors.

The problem was not patch generation — it was enforcement.

There is a missing safety layer between AI-generated fixes and production acceptance. OpsGuard was built to close that trust gap through deterministic verification.

What it does

OpsGuard is a zero-trust AI remediation engine exposed through a CLI interface.

It:

Clones a repository into an isolated workspace
Reproduces application crashes inside Docker
Classifies failures (code vs infrastructure)
Generates a minimal LLM-assisted patch, returned as a full file
Validates structural integrity using AST parsing and heuristics
Compiles the patch
Re-runs the application or test suite inside Docker
Accepts the fix only if exit_code == 0

If verification fails, the patch is rejected.

No verification → No acceptance.
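The acceptance rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not OpsGuard's actual code: the function names, the /app mount point, the --network none flag, and the timeout are all hypothetical choices. Only the python:3.11-slim image and the exit-code rule come from the description above.

```python
import subprocess
from typing import Callable, Sequence

IMAGE = "python:3.11-slim"  # runtime image named in this writeup


def build_verify_command(workspace: str, test_cmd: Sequence[str]) -> list[str]:
    """Build a `docker run` invocation that executes test_cmd on the workspace.

    `workspace` should be an absolute path so Docker treats it as a bind mount.
    """
    return [
        "docker", "run", "--rm",
        "--network", "none",        # assumption: no network during verification
        "-v", f"{workspace}:/app",  # assumption: patched copy is mounted at /app
        "-w", "/app",
        IMAGE,
        *test_cmd,
    ]


def verify_patch(
    workspace: str,
    test_cmd: Sequence[str],
    run: Callable = subprocess.run,  # injectable so the gate can be tested without Docker
    timeout: int = 300,              # assumption: hypothetical time limit in seconds
) -> bool:
    """Return True only when the container exits with code 0."""
    cmd = build_verify_command(workspace, test_cmd)
    try:
        result = run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # A hung run counts as a failed verification, never as a pass.
        return False
    return result.returncode == 0
```

Any outcome other than a clean exit, including a timeout, is treated as a rejection, which keeps the default on the safe side.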

How we built it

OpsGuard is structured as a deterministic orchestration engine using:

LangGraph for state-machine-based workflow control
Docker (python:3.11-slim) as the runtime trust boundary
AST parsing + structural heuristics to validate LLM output
Bounded retry logic (2 reproduction retries, 3 fix retries)
Workspace isolation to prevent repository mutation
Unified diff generation for auditability

LLM providers:

NVIDIA NIM (Llama 3.1 70B) — primary
Groq (Llama 3.3 70B) — fallback

The LLM proposes full-file patches only. It cannot execute code, control workflow state, or bypass validation. All enforcement is handled by the orchestrator.
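Because the model returns whole files, the audit diff can be derived by the orchestrator rather than trusted from the LLM. A minimal sketch using Python's standard difflib module follows. The function name and the a/ and b/ path prefixes are illustrative assumptions, not OpsGuard's actual code.

```python
import difflib


def make_audit_diff(original: str, patched: str, path: str) -> str:
    """Return a unified diff between the original and patched file contents."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        patched.splitlines(keepends=True),
        fromfile=f"a/{path}",  # assumption: git-style path prefixes
        tofile=f"b/{path}",
    )
    return "".join(diff)


if __name__ == "__main__":
    before = "x = 1\nprint(x)\n"
    after = "x = 2\nprint(x)\n"
    print(make_audit_diff(before, after, "app.py"))
```

Computing the diff from the two file versions means the audit artifact reflects what actually changed on disk, not what the model claims it changed.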

Challenges we ran into

The primary challenge was handling probabilistic LLM output within a deterministic system.

LLMs frequently:

Returned truncated files
Removed unrelated logic
Added unsafe imports
Produced syntactically valid but structurally damaging patches

To solve this, OpsGuard enforces layered validation gates including AST parsing, symbol preservation heuristics, syntax compilation, and Docker runtime verification.
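The static gates named above (AST parsing, symbol preservation, syntax compilation) could look roughly like the following Python sketch. The specific heuristics are assumptions made for illustration: the 0.5 length ratio, the check on top-level names only, and the new-import check are hypothetical, and the sketch assumes the original file itself parses.

```python
import ast


def top_level_symbols(source: str) -> set[str]:
    """Names of functions and classes defined at module level."""
    return {
        node.name
        for node in ast.parse(source).body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    }


def imported_modules(source: str) -> set[str]:
    """Top-level package names imported anywhere in the file."""
    mods: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            mods.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            mods.add(node.module.split(".")[0])
    return mods


def validate_patch(original: str, patched: str, min_length_ratio: float = 0.5) -> list[str]:
    """Return a list of rejection reasons; an empty list means the gates passed."""
    # Gate 1: the patched file must parse at all.
    try:
        ast.parse(patched)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    problems: list[str] = []

    # Gate 2 (hypothetical heuristic): a much shorter file suggests truncation.
    if len(patched.splitlines()) < min_length_ratio * len(original.splitlines()):
        problems.append("possible truncation: patched file is much shorter")

    # Gate 3: symbols present in the original must survive the patch.
    missing = top_level_symbols(original) - top_level_symbols(patched)
    if missing:
        problems.append(f"removed symbols: {sorted(missing)}")

    # Gate 4 (hypothetical heuristic): flag imports the original did not have.
    added = imported_modules(patched) - imported_modules(original)
    if added:
        problems.append(f"new imports: {sorted(added)}")

    # Gate 5: full compilation catches errors that parsing alone lets through.
    try:
        compile(patched, "<patch>", "exec")
    except SyntaxError as exc:
        problems.append(f"compile error: {exc.msg}")

    return problems
```

None of these checks executes the patched code, so they are cheap to run before the Docker verification step, which remains the final authority.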

Another challenge was balancing automation with safety. Instead of retrying indefinitely, the system enforces bounded retries and fails safely when limits are reached.
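A bounded fix loop of this kind can be sketched as follows. The callables and their signatures are hypothetical, and the sketch treats the limit of 3 as the total number of attempts, which is one possible reading of "3 fix retries".

```python
from typing import Callable, Optional, Tuple

MAX_FIX_ATTEMPTS = 3  # limit taken from this writeup; read here as total attempts


def remediate(
    generate_patch: Callable[[Optional[str]], str],  # hypothetical: feedback -> patched file
    verify: Callable[[str], Tuple[bool, str]],       # hypothetical: patch -> (ok, feedback)
    max_attempts: int = MAX_FIX_ATTEMPTS,
) -> Optional[str]:
    """Return the first verified patch, or None once the attempt budget is spent."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        patch = generate_patch(feedback)
        ok, feedback = verify(patch)
        if ok:
            return patch
    # Fail safe: no unverified patch ever leaves the loop.
    return None
```

Returning None rather than the last attempted patch makes the failure explicit, so a caller cannot mistake an unverified patch for an accepted one.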

Accomplishments that we're proud of

Designed a zero-trust AI enforcement architecture
Integrated deterministic Docker verification into the patch lifecycle
Built structural truncation detection for LLM output
Implemented bounded retry logic with explicit state transitions
Created structured audit artifacts including diffs and remediation reports

OpsGuard is not a patch generator — it is an enforcement engine.

What we learned

We learned that AI integration is not about generation — it is about control.

Deterministic systems must wrap probabilistic models with strict enforcement layers. Without verification gates, AI automation can introduce more risk than value.

True AI-assisted automation requires: