Screenshots:
- CLI execution flow: crash reproduced, patch generated, and fix verified inside Docker. Status: SUCCESS.
- Full-file patch returned by the LLM after passing AST parsing, truncation detection, and structural validation gates.
- Human-readable audit artifact detailing patch impact, modified lines, and verification status.
Inspiration
AI systems can generate code fixes rapidly, but they cannot prove those fixes are safe. During experimentation with LLM-based patching, I observed recurring issues: hallucinated imports, truncated files, silent regressions, and uncontrolled refactors.
The problem was not patch generation — it was enforcement.
There is a missing safety layer between AI-generated fixes and production acceptance. OpsGuard was built to close that trust gap through deterministic verification.
What it does
OpsGuard is a zero-trust AI remediation engine exposed through a CLI.
It:
- Clones a repository into an isolated workspace
- Reproduces application crashes inside Docker
- Classifies failures (code vs infrastructure)
- Generates a minimal LLM-assisted patch, returned as a full file
- Validates structural integrity using AST parsing and heuristics
- Compiles the patched file to catch syntax errors
- Re-runs the application or test suite inside Docker
- Accepts the fix only if exit_code == 0
If verification fails, the patch is rejected.
No verification → No acceptance.
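The acceptance rule above can be sketched as a small verification gate. This is a minimal illustration under stated assumptions, not OpsGuard's actual code: the names `docker_command`, `verify_patch`, and `VerificationResult`, the `/app` mount point, and the `--network none` flag are all hypothetical choices made for the example. Only the `python:3.11-slim` image and the `exit_code == 0` rule come from the description.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class VerificationResult:
    exit_code: int
    accepted: bool


def docker_command(workspace: str, command: Sequence[str],
                   image: str = "python:3.11-slim") -> list[str]:
    """Build a `docker run` invocation for the patched workspace.

    `workspace` should be an absolute path so Docker treats it as a bind mount.
    """
    return [
        "docker", "run", "--rm",
        "--network", "none",        # assumption: verification needs no network
        "-v", f"{workspace}:/app",  # assumption: workspace is mounted at /app
        "-w", "/app",
        image, *command,
    ]


def run_subprocess(argv: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(argv), capture_output=True).returncode


def verify_patch(workspace: str, command: Sequence[str],
                 runner: Callable[[Sequence[str]], int] = run_subprocess
                 ) -> VerificationResult:
    """Accept the patch only if the containerized run exits with code 0."""
    exit_code = runner(docker_command(workspace, command))
    return VerificationResult(exit_code=exit_code, accepted=(exit_code == 0))
```

Passing the runner as a parameter keeps the gate testable without a Docker daemon: a fake runner can return any exit code, and the accept/reject decision stays the same one-line comparison.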
How we built it
OpsGuard is structured as a deterministic orchestration engine using:
- LangGraph for state-machine-based workflow control
- Docker (python:3.11-slim) as the runtime trust boundary
- AST parsing + structural heuristics to validate LLM output
- Bounded retry logic (2 reproduction retries, 3 fix retries)
- Workspace isolation to prevent repository mutation
- Unified diff generation for auditability
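As an illustration of the unified diff item above, the audit diff could be produced with Python's standard `difflib` module. This is a sketch only; the function name `make_unified_diff` and the `a/` and `b/` path prefixes are assumptions, and the project may generate its diffs differently.

```python
import difflib


def make_unified_diff(original: str, patched: str, path: str) -> str:
    """Return a unified diff between the original and patched file contents."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        patched.splitlines(keepends=True),
        fromfile=f"a/{path}",  # assumption: git-style path prefixes
        tofile=f"b/{path}",
    )
    return "".join(diff)


if __name__ == "__main__":
    before = "def ratio(a, b):\n    return a / 0\n"
    after = "def ratio(a, b):\n    return a / b\n"
    print(make_unified_diff(before, after, "app.py"))
```

Because the LLM returns whole files, computing the diff on the orchestrator side is what makes the size of the actual change visible to a reviewer.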
LLM providers:
- NVIDIA NIM (Llama 3.1 70B) — primary
- Groq (Llama 3.3 70B) — fallback
The LLM proposes full-file patches only. It cannot execute code, control workflow state, or bypass validation. All enforcement is handled by the orchestrator.
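The primary and fallback arrangement can be sketched as an ordered list of provider callables, each taking a prompt and returning text. The sketch below is hypothetical: `propose_patch` and `AllProvidersFailed` are invented names, and the real NVIDIA NIM and Groq clients are deliberately left out and represented as plain functions. The point it illustrates is that the model only ever hands back a string; it never touches workflow state.

```python
from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider raised an error."""


def propose_patch(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try providers in order; return (provider_name, proposed_file_text)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider failure triggers the fallback
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

Whatever text comes back is still untrusted input and has to pass the validation and verification gates before it is accepted.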
Challenges we ran into
The primary challenge was handling probabilistic LLM output within a deterministic system.
LLMs frequently:
- Returned truncated files
- Removed unrelated logic
- Added unsafe imports
- Produced syntactically valid but structurally damaging patches
To solve this, OpsGuard enforces layered validation gates including AST parsing, symbol preservation heuristics, syntax compilation, and Docker runtime verification.
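A minimal sketch of what such static gates might look like, using Python's standard `ast` module. The specific heuristics here (top-level symbol preservation, a new-import check, and a line-count ratio for truncation) and the 0.5 threshold are illustrative assumptions, not OpsGuard's actual rules. The sketch also assumes the original file parses, meaning the crash being fixed is a runtime failure rather than a syntax error.

```python
import ast


def top_level_symbols(source: str) -> set[str]:
    """Names of functions and classes defined at module level."""
    return {
        node.name
        for node in ast.parse(source).body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    }


def imported_modules(source: str) -> set[str]:
    """Top-level package names imported anywhere in the file."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module.split(".")[0])
    return modules


def validate_patch(original: str, patched: str,
                   min_length_ratio: float = 0.5) -> list[str]:
    """Return a list of violations; an empty list means the gates passed."""
    try:
        ast.parse(patched)
    except SyntaxError as exc:
        # Truncated output very often fails here (unclosed bracket or block).
        return [f"syntax error: {exc.msg} (line {exc.lineno})"]

    violations = []
    removed = top_level_symbols(original) - top_level_symbols(patched)
    if removed:
        violations.append(f"removed symbols: {sorted(removed)}")
    added = imported_modules(patched) - imported_modules(original)
    if added:
        violations.append(f"new imports: {sorted(added)}")
    # Hypothetical truncation heuristic: the file shrank by more than half.
    if len(patched.splitlines()) < min_length_ratio * len(original.splitlines()):
        violations.append("possible truncation: patched file is much shorter")
    return violations
```

These checks are cheap and run before any container is started, so a structurally damaged patch is rejected without spending a Docker verification run on it.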
Another challenge was balancing automation with safety. Instead of retrying indefinitely, the system enforces bounded retries and fails safely when limits are reached.
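The bounded fix loop can be sketched in plain Python as follows. This is not the LangGraph implementation; it only shows the control flow, with hypothetical names (`run_fix_loop`, `FixOutcome`) and injected callables standing in for patch generation, static validation, and Docker verification. It also assumes that "3 fix retries" means three attempts in total.

```python
from dataclasses import dataclass, field
from typing import Callable

MAX_FIX_ATTEMPTS = 3  # assumption: "3 fix retries" read as 3 attempts in total


@dataclass
class FixOutcome:
    status: str    # "ACCEPTED" or "REJECTED"
    attempts: int
    history: list[str] = field(default_factory=list)


def run_fix_loop(
    generate: Callable[[int], str],        # attempt number -> proposed file text
    validate: Callable[[str], list[str]],  # patch -> list of violations
    verify: Callable[[str], int],          # patch -> container exit code
    max_attempts: int = MAX_FIX_ATTEMPTS,
) -> FixOutcome:
    """Try up to `max_attempts` patches, then fail safely."""
    history: list[str] = []
    for attempt in range(1, max_attempts + 1):
        patch = generate(attempt)
        if validate(patch):
            history.append(f"attempt {attempt}: rejected by validation")
            continue
        if verify(patch) == 0:
            history.append(f"attempt {attempt}: verified")
            return FixOutcome("ACCEPTED", attempt, history)
        history.append(f"attempt {attempt}: verification failed")
    # Limit reached: no patch is accepted and the workflow terminates.
    return FixOutcome("REJECTED", max_attempts, history)
```

The loop has exactly two exits, a verified patch or an exhausted budget, so the worst case is a bounded number of LLM calls and container runs followed by an explicit rejection.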
Accomplishments that we're proud of
- Designed a zero-trust AI enforcement architecture
- Integrated deterministic Docker verification into the patch lifecycle
- Built structural truncation detection for LLM output
- Implemented bounded retry logic with explicit state transitions
- Created structured audit artifacts including diffs and remediation reports
OpsGuard is not a patch generator — it is an enforcement engine.
What we learned
We learned that AI integration is not about generation — it is about control.
Deterministic systems must wrap probabilistic models with strict enforcement layers. Without verification gates, AI automation can introduce more risk than value.
True AI-assisted automation requires:
- Isolation
- Validation
- Runtime proof
- Fail-safe termination
What's next for OpsGuard — Zero-Trust AI Remediation
Future directions include:
- Multi-file patch validation
- CI/CD pipeline integration
- GitHub PR automation
- Coverage-aware regression enforcement
- Policy-driven validation rules
OpsGuard is designed to evolve into a CI-integrated remediation controller that validates AI-generated fixes before they ever reach production.