Inspiration

Software incidents are still handled with too much context switching. Engineers jump between alerts, dashboards, commits, runbooks, architecture docs, and old postmortems while trying to reconstruct what happened under pressure. I wanted to build something that reflects where software engineering is going next: not just AI that writes code, but AI that helps supervise and improve real engineering workflows.

That led to DevProd, an incident-response control plane designed around bounded, reviewable agents. The goal was to create a system that can investigate incidents, surface evidence, retrieve the right operational knowledge, rank likely root causes, suggest remediation, and draft a postmortem, while still keeping a human in control.

What it does

DevProd is an AI-powered incident investigation workflow for software teams.

Given an incident, it can:

  • classify the issue and choose an investigation path
  • collect and structure evidence from alerts and incident context
  • correlate likely causal changes
  • retrieve relevant runbooks, architecture notes, and prior incidents
  • rank root-cause hypotheses
  • recommend remediation steps
  • draft a postmortem
  • expose a reviewable workflow trace

It also includes a benchmark arena of synthetic engineering incidents with expected outcomes and rubrics, so the workflow can be tested against realistic scenarios instead of being treated like an unmeasured chatbot.
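
To make that concrete, here is the rough shape of one arena entry. This is an illustrative sketch, not the exact schema used in arena/scenarios; every field name below is a stand-in.

```python
# Illustrative shape of a benchmark scenario; the real schema in
# arena/scenarios may differ, and all field names here are hypothetical.
scenario = {
    "id": "checkout-latency-001",
    "incident": {
        "title": "p99 latency spike on the checkout service",
        "alerts": ["HighLatency: checkout p99 > 2s for 10m"],
        "recent_changes": [
            "deploy: checkout v2.14.0",
            "config: db connection pool 50 -> 10",
        ],
    },
    "expected": {
        "root_cause": "connection pool misconfiguration",
        "relevant_docs": ["runbooks/checkout-latency.md"],
    },
    "rubric": {
        "true_cause_in_top_2_hypotheses": 3,  # points awarded by the grader
        "cites_correct_runbook": 2,
        "remediation_is_safe": 2,
    },
}
```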

How we built it

I built DevProd as a small full-stack application with a clear separation between the user-facing control plane, the workflow orchestration layer, and the benchmark corpus.

Frontend

  • Next.js dashboard
  • incident inbox
  • investigation view
  • evidence, retrieval, hypotheses, remediation, and postmortem panels

Backend

  • FastAPI service
  • structured API routes for incident intake, investigation runs, retrieval, hypotheses, remediation, and postmortem outputs (a minimal route sketch follows after this list)
  • local orchestration stub that reads benchmark scenarios, knowledge documents, and prompt bundles
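
As a sketch of what those structured routes look like, here is a minimal intake endpoint. Route paths, model names, and fields are assumptions for illustration, not the exact DevProd API.

```python
# Minimal sketch of the intake route; paths, models, and field names are
# illustrative assumptions rather than the exact DevProd API.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class IncidentIntake(BaseModel):
    title: str
    description: str
    alerts: list[str] = []

class InvestigationRun(BaseModel):
    run_id: str
    incident_id: str
    status: str  # "queued", "running", or "complete"

@app.post("/incidents", response_model=InvestigationRun)
def create_incident(intake: IncidentIntake) -> InvestigationRun:
    # The real service would persist the incident and hand it to the
    # orchestration layer; this stub just returns a queued run.
    return InvestigationRun(run_id="run-001", incident_id="inc-001", status="queued")
```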

Workflow and evaluation

  • specialized prompt roles for:
    • triage
    • evidence
    • retrieval
    • hypothesis
    • remediation
    • postmortem
    • policy review
  • seeded benchmark scenarios in arena/scenarios
  • retrieval corpus in knowledge
  • shared response contracts in packages/contracts (example shape sketched below)
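
For a feel of what a shared contract enforces, here is a minimal Pydantic-style sketch of the hypothesis step's output; the actual definitions in packages/contracts may use different names and fields.

```python
# Hypothetical sketch of one shared response contract; the real
# definitions live in packages/contracts and may differ.
from pydantic import BaseModel

class Hypothesis(BaseModel):
    summary: str              # one-line statement of the suspected cause
    confidence: float         # 0.0-1.0, assigned by the hypothesis role
    evidence_ids: list[str]   # evidence items cited in support

class HypothesisResponse(BaseModel):
    incident_id: str
    hypotheses: list[Hypothesis]  # ordered from most to least likely
```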

DigitalOcean Gradient AI

The system is designed around DigitalOcean Gradient AI as the intended hosted AI layer for:

  • agent orchestration
  • inference
  • retrieval-backed workflows
  • evaluation and traces

To make the project reviewable under hackathon time constraints, I included:

  • a demo provider for local execution
  • a live provider integration path in the backend for DigitalOcean Gradient AI (see the provider sketch after this list)
  • a DigitalOcean App Platform deployment spec in .do/app.yaml
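
The demo/live split roughly follows a provider seam like the sketch below. Class and method names are assumptions for illustration, and the Gradient AI call itself is deliberately left unimplemented, since wiring the hosted endpoint was the in-progress part.

```python
# Sketch of the provider seam; class and method names are hypothetical.
from typing import Protocol

class InferenceProvider(Protocol):
    def complete(self, role: str, prompt: str) -> str:
        """Run one prompt-role step and return the model's output."""
        ...

class DemoProvider:
    """Local demo mode: returns canned outputs so the workflow can be
    reviewed end to end without any hosted dependency."""
    def complete(self, role: str, prompt: str) -> str:
        return f"[demo:{role}] canned response for local review"

class GradientProvider:
    """Live path intended for DigitalOcean Gradient AI. The actual HTTP
    call is elided; endpoint and auth wiring were still in progress."""
    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    def complete(self, role: str, prompt: str) -> str:
        raise NotImplementedError("wire this to Gradient AI inference")
```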

Challenges we ran into

The biggest challenge was time. I wanted the project to be more than a polished UI, so much of the effort went into the underlying system shape: contracts, scenarios, knowledge documents, evaluation artifacts, and a workflow that is inspectable rather than magical.

Another challenge was deployment. I prepared the project for DigitalOcean App Platform and set up a multi-service app spec, but I ran out of time before completing a final public deployment. Rather than fake that part, I kept the submission honest and focused on delivering a runnable local prototype with the deployment configuration included.

There was also a product-design challenge: keeping the workflow ambitious without making it look like an unsafe autonomous operator. That is why DevProd is structured around bounded agents, review steps, retrieval, and explicit traces.

Accomplishments that we're proud of

  • Built a full-stack incident-response application instead of only a prompt demo
  • Created a multi-agent workflow with distinct responsibilities
  • Added a benchmark arena with multiple realistic incident scenarios
  • Built a retrieval corpus of runbooks, architecture notes, incidents, and postmortems
  • Exposed investigation outputs through structured backend routes
  • Created a dashboard that surfaces evidence, hypotheses, remediation, and postmortem results
  • Included deployment config for DigitalOcean App Platform and a live-provider path for Gradient AI
  • Kept the system reviewable, bounded, and measurable

What we learned

I learned that the most useful AI systems for engineering are not the ones that try to act omniscient. They are the ones that make context legible, constrain behavior, and give humans better leverage during messy workflows.

I also learned how important benchmarkability is. Once you frame the product as a workflow rather than a chatbot, you naturally need scenarios, expected outcomes, rubrics, and traces. That changes how you design both the product and the codebase.

On the platform side, I learned a lot about shaping an app for DigitalOcean deployment, separating demo-mode behavior from live-provider behavior, and building toward a cloud-native AI architecture even when the final hosted deployment is still in progress.

What's next for DevProd

Next, I want to:

  • complete the public DigitalOcean App Platform deployment
  • connect the live workflow fully to DigitalOcean Gradient AI
  • expand the benchmark arena with more failure modes and distractor patterns
  • add richer trace and evaluation views in the dashboard
  • move run history from local SQLite to a persistent managed store
  • support reviewer feedback loops for workflow policy iteration
  • make DevProd usable as a real incident copilot for small engineering teams

The long-term vision is for DevProd to become a practical control plane for supervising, evaluating, and improving AI-assisted engineering operations.

Built With

Next.js, FastAPI, Python, SQLite, DigitalOcean App Platform, DigitalOcean Gradient AI