AegisHarness Project Story
Inspiration
AegisHarness was inspired by a problem we kept seeing in AI-assisted coding: agents can generate code quickly, but they often skip the careful engineering steps that make a change safe to merge. A vague request can turn into a broad patch, security risks can be missed, and generated code may never be tested in a realistic review loop.
We wanted to build a system that treats AI coding less like autocomplete and more like a disciplined engineering workflow.
What it does
AegisHarness is an agentic compiler and guardrail console for safer AI coding. It takes a natural-language request, searches for relevant open-source context, rewrites the request into a structured engineering brief, generates a preflight bug list, pauses for human approval, routes the task to an appropriate model, and then runs the result through a GitHub sandbox with automated review and repair.
The workflow is organized into five phases:
- Intent parsing and context building
- Preflight bug-list generation
- Human-in-the-loop approval
- Compute routing and code generation
- GitHub sandbox review and repair
The core idea is that an AI coding task should not move directly from prompt to patch. It should pass through explicit constraints, reviewable requirements, and bounded feedback loops. In the project, the repair loop is capped so the system cannot iterate forever:
$$ \text{max\_repair\_attempts} = 3 $$
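A minimal sketch of how a capped review-and-repair loop could look. Only the cap of 3 comes from the project; the function name `run_repair_loop` and the `review` and `repair` callables are hypothetical stand-ins for the sandbox review and the model-driven fix, not the project's actual code.

```python
from typing import Callable

MAX_REPAIR_ATTEMPTS = 3  # the cap stated in the write-up


def run_repair_loop(
    patch: str,
    review: Callable[[str], list[str]],       # hypothetical: returns a list of issues
    repair: Callable[[str, list[str]], str],  # hypothetical: returns a revised patch
    max_attempts: int = MAX_REPAIR_ATTEMPTS,
) -> tuple[bool, str, int]:
    """Review a patch, repairing it at most `max_attempts` times.

    Returns (passed, final_patch, repairs_used).
    """
    repairs_used = 0
    while True:
        issues = review(patch)
        if not issues:
            # The review is clean, so the loop ends successfully.
            return True, patch, repairs_used
        if repairs_used >= max_attempts:
            # The budget is exhausted: stop instead of iterating forever.
            return False, patch, repairs_used
        patch = repair(patch, issues)
        repairs_used += 1
```

With this shape, a patch that never passes is reviewed four times and repaired three times before the loop gives up and reports failure.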
How we built it
We built the frontend with React, TypeScript, and Vite. The interface acts as a workflow console where users can enter a request, watch each phase progress, inspect the generated agent brief, review negative constraints, and approve or reject execution.
The backend uses Python, FastAPI, SQLAlchemy, and a state-machine architecture to persist task state and coordinate the workflow. We integrated AI provider routing so the system can prefer Gemini when available and fall back to Clod.io when needed. We also connected the workflow to GitHub, Greptile, and repository-search services so generated work can be grounded in real project context and reviewed in a sandbox loop.
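The prefer-then-fall-back routing described above can be sketched as an ordered list of providers. This is an illustration under assumptions: `ProviderUnavailable`, `route_generation`, and the provider callables are hypothetical names, and no real Gemini or Clod.io client calls are shown.

```python
from typing import Callable

# A provider is modeled as a function from prompt to generated text (assumption).
Provider = Callable[[str], str]


class ProviderUnavailable(Exception):
    """Hypothetical error raised when a provider cannot serve a request."""


def route_generation(
    prompt: str, providers: list[tuple[str, Provider]]
) -> tuple[str, str]:
    """Try providers in preference order and return (provider_name, output)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderUnavailable as exc:
            # Record why this provider was skipped, then try the next one.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("no provider available: " + "; ".join(errors))
```

Returning the provider name alongside the output makes it possible to record which route was taken for each task.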
Challenges we ran into
One of the biggest challenges was designing a clear state machine. The system needed to feel agentic while staying predictable. For example, PENDING_APPROVAL is a mandatory state before execution, and a task reaches FINISHED only after the sandbox passes. These rules keep the workflow auditable instead of letting the agent silently continue through risky steps.
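A rough sketch of how those two rules could be enforced with an explicit transition table. Only PENDING_APPROVAL and FINISHED are state names from the project; every other state, the `advance` function, and the `sandbox_passed` flag are assumptions made for illustration.

```python
from enum import Enum


class TaskState(Enum):
    PARSING = "PARSING"                    # hypothetical
    PREFLIGHT = "PREFLIGHT"                # hypothetical
    PENDING_APPROVAL = "PENDING_APPROVAL"  # named in the write-up
    EXECUTING = "EXECUTING"                # hypothetical
    SANDBOX_REVIEW = "SANDBOX_REVIEW"      # hypothetical
    FINISHED = "FINISHED"                  # named in the write-up
    REJECTED = "REJECTED"                  # hypothetical
    FAILED = "FAILED"                      # hypothetical


# EXECUTING is reachable only from PENDING_APPROVAL (first run) or from
# SANDBOX_REVIEW (a repair pass), so the approval gate cannot be skipped.
ALLOWED_TRANSITIONS = {
    TaskState.PARSING: {TaskState.PREFLIGHT},
    TaskState.PREFLIGHT: {TaskState.PENDING_APPROVAL},
    TaskState.PENDING_APPROVAL: {TaskState.EXECUTING, TaskState.REJECTED},
    TaskState.EXECUTING: {TaskState.SANDBOX_REVIEW},
    TaskState.SANDBOX_REVIEW: {
        TaskState.FINISHED,
        TaskState.EXECUTING,
        TaskState.FAILED,
    },
}


class InvalidTransition(Exception):
    pass


def advance(
    current: TaskState, target: TaskState, *, sandbox_passed: bool = False
) -> TaskState:
    """Return `target` if the move is legal, otherwise raise InvalidTransition."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise InvalidTransition(f"{current.name} -> {target.name} is not allowed")
    if target is TaskState.FINISHED and not sandbox_passed:
        raise InvalidTransition("FINISHED requires a passing sandbox run")
    return target
```

Keeping the table as data means the legal paths can be listed and unit-tested without running any agent.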
Another challenge was translating vague user intent into useful engineering constraints. It was not enough to simply pass the user's request into a model. We needed to retrieve context, identify likely failure modes, generate negative constraints, and present everything in a way that a human could review before execution.
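One way to picture the reviewable output of that step is a small data structure that keeps the request, retrieved context, likely failure modes, and negative constraints together. The class `AgentBrief` and its field and method names are hypothetical; the project's real brief format is not shown in this write-up.

```python
from dataclasses import dataclass, field


@dataclass
class AgentBrief:
    """Hypothetical container for what a human reviews before execution."""

    request: str
    context_snippets: list[str] = field(default_factory=list)
    failure_modes: list[str] = field(default_factory=list)
    negative_constraints: list[str] = field(default_factory=list)

    def is_reviewable(self) -> bool:
        # Assumption: a brief is ready for approval only once it names at
        # least one failure mode and one negative constraint.
        return bool(self.failure_modes) and bool(self.negative_constraints)

    def render(self) -> str:
        """Format the brief as plain text for the approval screen."""
        lines = [f"Request: {self.request}"]
        sections = (
            ("Context", self.context_snippets),
            ("Likely failure modes", self.failure_modes),
            ("Negative constraints", self.negative_constraints),
        )
        for title, items in sections:
            lines.append(f"{title}:")
            lines.extend(f"  - {item}" for item in items or ["(none)"])
        return "\n".join(lines)
```

A brief with empty risk sections would fail `is_reviewable()`, which gives the approval step a simple check that the preflight work actually happened.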
Accomplishments that we're proud of
We are proud that AegisHarness turns AI coding into a structured, inspectable workflow instead of a single prompt-response interaction. The human approval step, preflight risk generation, provider routing, and bounded repair loop all work together to make the system more trustworthy.
We are also proud of the full-stack implementation: a React workflow console, a Python backend, persistent task state, AI service integrations, and a testable state-machine design.
What we learned
We learned that safer AI coding depends on structure. The model is only one part of the system; the surrounding workflow matters just as much. Clear phases, human checkpoints, retrieved context, negative constraints, bounded retries, and testable outputs all make the final result more reliable.
We also learned how important it is to separate agent autonomy from agent accountability. AegisHarness can automate parts of the coding process, but it still exposes the reasoning, risks, and approval points that engineers need.
What's next for AegisHarness
Next, we want to expand AegisHarness with deeper GitHub sandbox automation, richer repository analysis, stronger review feedback, and more detailed observability for each workflow phase.
We also want to improve the user experience around editing the generated brief, comparing repair attempts, and tracking why a model or route was selected. Over time, AegisHarness could become a full guardrail layer for teams using AI agents in real software projects.
Built With
- clod.io-api
- conda
- fastapi
- gemini-api
- github-api
- greptile-api
- lucide-react
- nia-api
- pytest
- python
- react
- sqlalchemy
- trynia-api
- typescript
- uvicorn
- vite