AegisHarness Project Story

Inspiration

AegisHarness was inspired by a problem we kept seeing in AI-assisted coding: agents can generate code quickly, but they often skip the careful engineering steps that make a change safe to merge. A vague request can turn into a broad patch, security risks can be missed, and generated code may never be tested in a realistic review loop.

We wanted to build a system that treats AI coding less like autocomplete and more like a disciplined engineering workflow.

What it does

AegisHarness is an agentic compiler and guardrail console for safer AI coding. It takes a natural-language request, searches for relevant open-source context, rewrites the request into a structured engineering brief, generates a preflight bug list, pauses for human approval, routes the task to an appropriate model, and then runs the result through a GitHub sandbox with automated review and repair.

The workflow is organized into five phases:

1. Intent parsing and context building
2. Preflight bug-list generation
3. Human-in-the-loop approval
4. Compute routing and code generation
5. GitHub sandbox review and repair

The core idea is that an AI coding task should not move directly from prompt to patch. It should pass through explicit constraints, reviewable requirements, and bounded feedback loops. In the project, the repair loop is capped so the system cannot iterate forever:

$$ \text{max\_repair\_attempts} = 3 $$
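A minimal sketch of what such a bounded loop could look like is shown here. The `review` and `repair` callables are hypothetical stand-ins for the sandbox review and the model-driven repair step; only the cap of 3 comes from the project itself.

```python
from typing import Callable

MAX_REPAIR_ATTEMPTS = 3  # the cap stated in the project story


def run_repair_loop(
    patch: str,
    review: Callable[[str], list[str]],
    repair: Callable[[str, list[str]], str],
    max_repair_attempts: int = MAX_REPAIR_ATTEMPTS,
) -> tuple[str, bool, int]:
    """Review a patch and repair it at most `max_repair_attempts` times.

    `review` returns a list of issues (empty means the patch passed).
    `repair` returns a new patch given the old patch and its issues.
    Both are hypothetical placeholders, not the project's real functions.

    Returns (final_patch, passed, repairs_used).
    """
    repairs_used = 0
    while True:
        issues = review(patch)
        if not issues:
            # The review is clean, so the loop ends successfully.
            return patch, True, repairs_used
        if repairs_used >= max_repair_attempts:
            # The budget is spent: stop instead of iterating forever.
            return patch, False, repairs_used
        patch = repair(patch, issues)
        repairs_used += 1
```

With the default cap, a patch that never passes is reviewed four times and repaired three times before the loop gives up and reports failure.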

How we built it

We built the frontend with React, TypeScript, and Vite. The interface acts as a workflow console where users can enter a request, watch each phase progress, inspect the generated agent brief, review negative constraints, and approve or reject execution.

The backend uses Python, FastAPI, SQLAlchemy, and a state-machine architecture to persist task state and coordinate the workflow. We integrated AI provider routing so the system can prefer Gemini when available and fall back to Clod.io when needed. We also connected the workflow to GitHub, Greptile, and repository-search services so generated work can be grounded in real project context and reviewed in a sandbox loop.
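The prefer-then-fall-back routing described above could be sketched roughly as follows. The `Provider` wrapper, the `ProviderUnavailable` exception, and the `route` function are illustrative names, not the project's actual API, and the `generate` callables stand in for real Gemini or Clod.io client calls.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class ProviderUnavailable(Exception):
    """Hypothetical error raised when a provider cannot serve a request."""


@dataclass(frozen=True)
class Provider:
    name: str
    # Placeholder for a real client call (for example to Gemini or Clod.io).
    generate: Callable[[str], str]


def route(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try providers in preference order and return (provider_name, output).

    Only ProviderUnavailable triggers a fallback; any other exception
    propagates so that real bugs are not silently masked.
    """
    errors = []
    for provider in providers:
        try:
            return provider.name, provider.generate(prompt)
        except ProviderUnavailable as exc:
            errors.append(f"{provider.name}: {exc}")
    raise RuntimeError("no provider available: " + "; ".join(errors))
```

Passing the providers as an ordered list, Gemini first and Clod.io second, keeps the preference explicit, and returning the provider name alongside the output makes it possible to record which route was taken.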

Challenges we ran into

One of the biggest challenges was designing the state machine clearly. The system needed to feel agentic, but still predictable. For example, PENDING_APPROVAL is mandatory before execution, and FINISHED only happens after the sandbox passes. That helped keep the workflow auditable instead of letting the agent silently continue through risky steps.
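One way to make those two rules checkable is an explicit transition table, sketched below. Only `PENDING_APPROVAL` and `FINISHED` are state names taken from the project; every other state name and the `transition` helper are assumptions made for illustration.

```python
from enum import Enum


class TaskState(Enum):
    # Only PENDING_APPROVAL and FINISHED appear in the project story;
    # the remaining names are hypothetical.
    INTENT_PARSING = "INTENT_PARSING"
    PREFLIGHT = "PREFLIGHT"
    PENDING_APPROVAL = "PENDING_APPROVAL"
    EXECUTING = "EXECUTING"
    SANDBOX_REVIEW = "SANDBOX_REVIEW"
    REPAIRING = "REPAIRING"
    FINISHED = "FINISHED"
    REJECTED = "REJECTED"
    FAILED = "FAILED"


# Each state maps to the set of states it may move to next.
ALLOWED_TRANSITIONS = {
    TaskState.INTENT_PARSING: {TaskState.PREFLIGHT},
    TaskState.PREFLIGHT: {TaskState.PENDING_APPROVAL},
    # Execution is reachable only from an approval decision.
    TaskState.PENDING_APPROVAL: {TaskState.EXECUTING, TaskState.REJECTED},
    TaskState.EXECUTING: {TaskState.SANDBOX_REVIEW},
    # FINISHED is reachable only from a sandbox review.
    TaskState.SANDBOX_REVIEW: {
        TaskState.FINISHED,
        TaskState.REPAIRING,
        TaskState.FAILED,
    },
    TaskState.REPAIRING: {TaskState.SANDBOX_REVIEW},
    TaskState.FINISHED: set(),
    TaskState.REJECTED: set(),
    TaskState.FAILED: set(),
}


def transition(current: TaskState, target: TaskState) -> TaskState:
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target
```

Because the table is plain data, a test can assert directly that `EXECUTING` has `PENDING_APPROVAL` as its only predecessor and that `FINISHED` has `SANDBOX_REVIEW` as its only predecessor.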

Another challenge was translating vague user intent into useful engineering constraints. It was not enough to simply pass the user's request into a model. We needed to retrieve context, identify likely failure modes, generate negative constraints, and present everything in a way that a human could review before execution.
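As a rough illustration, the reviewable output of this step could be modeled as a small record that gathers the request, the retrieved context, the likely failure modes, and the negative constraints in one place. The `AgentBrief` class and its field names are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass, field


@dataclass
class AgentBrief:
    """Hypothetical container for what a human reviews before execution."""

    request: str
    context_refs: list[str] = field(default_factory=list)
    failure_modes: list[str] = field(default_factory=list)
    negative_constraints: list[str] = field(default_factory=list)

    def is_reviewable(self) -> bool:
        # Assumption: a brief is worth showing to a reviewer only once it
        # names at least one risk and at least one constraint.
        return bool(self.failure_modes) and bool(self.negative_constraints)

    def render(self) -> str:
        """Format the brief as plain text for the approval screen."""
        lines = [f"Request: {self.request}"]
        sections = (
            ("Context", self.context_refs),
            ("Likely failure modes", self.failure_modes),
            ("Negative constraints", self.negative_constraints),
        )
        for title, items in sections:
            lines.append(f"{title}:")
            lines.extend(f"  - {item}" for item in (items or ["(none)"]))
        return "\n".join(lines)
```

For example, a request such as "add login rate limiting" might carry a negative constraint like "do not change the session schema", which the reviewer sees before approving or rejecting the task.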

Accomplishments that we're proud of

We are proud that AegisHarness turns AI coding into a structured, inspectable workflow instead of a single prompt-response interaction. The human approval step, preflight risk generation, provider routing, and bounded repair loop all work together to make the system more trustworthy.

We are also proud of the full-stack implementation: a React workflow console, a Python backend, persistent task state, AI service integrations, and a testable state-machine design.

What we learned

We learned that safer AI coding depends on structure. The model is only one part of the system; the surrounding workflow matters just as much. Clear phases, human checkpoints, retrieved context, negative constraints, bounded retries, and testable outputs all make the final result more reliable.

We also learned how important it is to separate agent autonomy from agent accountability. AegisHarness can automate parts of the coding process, but it still exposes the reasoning, risks, and approval points that engineers need.

What's next for AegisHarness

Next, we want to expand AegisHarness with deeper GitHub sandbox automation, richer repository analysis, stronger review feedback, and more detailed observability for each workflow phase.

We also want to improve the user experience around editing the generated brief, comparing repair attempts, and tracking why a model or route was selected. Over time, AegisHarness could become a full guardrail layer for teams using AI agents in real software projects.

Built With

clod.io-api
conda
fastapi
gemini-api
github-api
greptile-api
lucide-react
nia-api
pytest
python
react
sqlalchemy
trynia-api
typescript
uvicorn
vite

Updates

SebastianZzzz Zhu started this project — May 10, 2026 08:58 PM EDT
