Inspiration
Production failures rarely happen because engineers don’t care. They happen because follow-ups get delayed.
We were inspired by real-world incidents like large-scale AWS service outages, Cloudflare edge failures, and react2shell-style supply-chain vulnerabilities, where early signals existed but fixes took hours or days to land. Logs were noisy, vulnerability feeds were overwhelming, and translating an incident into a safe code change was still a manual process.
What if production incidents automatically turned into pull requests?
What it does
DevWatch AI continuously monitors production systems for runtime errors, health regressions, and dependency vulnerabilities. When an issue is detected, it:
- Analyzes the root cause using DigitalOcean Gradient AI, grounded in the project’s codebase
- Generates a fix using OpenAI GPT-OSS 120B via Gradient Serverless Inference
- Opens a detailed GitHub pull request, with optional auto-merge when safety rules are met
This turns incidents into fixes in minutes instead of hours.
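The three steps above can be sketched as a small pipeline. This is a minimal illustration, not the project's actual code: the type names, the `Stages` interface, and `handleIncident` are all hypothetical, and each stage is passed in so that real Gradient and GitHub clients (or test stubs) can be swapped in.

```typescript
// Hypothetical types: names and fields are illustrative, not DevWatch AI's real schema.
interface Incident {
  id: string;
  kind: "runtime_error" | "health_regression" | "vulnerability";
  summary: string;
}

interface Analysis {
  rootCause: string;
  strategy: string;
}

interface Fix {
  diff: string;
  confidence: number; // assumed to be a score between 0 and 1
}

// Each stage is injected, so this sketch makes no claims about any vendor API.
interface Stages {
  analyze(incident: Incident): Promise<Analysis>;
  generateFix(incident: Incident, analysis: Analysis): Promise<Fix>;
  openPullRequest(incident: Incident, analysis: Analysis, fix: Fix): Promise<string>;
}

// Runs analysis, fix generation, and PR creation in order; resolves to the PR URL.
async function handleIncident(incident: Incident, stages: Stages): Promise<string> {
  const analysis = await stages.analyze(incident);
  const fix = await stages.generateFix(incident, analysis);
  return stages.openPullRequest(incident, analysis, fix);
}
```

Keeping the stages behind an interface also makes it straightforward to run the pipeline against controlled failure scenarios.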
How we built it
DevWatch AI runs entirely on DigitalOcean:
- DO Functions run scheduled monitors for logs, health checks, and CVEs
- Logs and artifacts are stored in DigitalOcean Spaces
- Code is synced into a Gradient Knowledge Base via GitHub Actions
- A background Event Processor calls:
  - Gradient Agent API for root-cause analysis and fix strategy
  - Gradient Serverless Inference (GPT-OSS 120B) for code generation
- Fixes, events, and audits are stored in DO Managed PostgreSQL
- PRs are created and optionally auto-merged via the GitHub API
A real-time Next.js dashboard shows system health, events, and generated fixes.
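Because raw logs are too large to send to a model directly, the log monitor has to reduce them to compact events first. The sketch below shows one way that could work: group error and warning lines by a normalized message signature and keep counts and timestamps. The log line format (`<timestamp> <LEVEL> <message>`), the `LogEvent` shape, and the function names are assumptions made for illustration.

```typescript
// Hypothetical event shape; field names are illustrative.
interface LogEvent {
  signature: string; // normalized message used for grouping
  level: "error" | "warn";
  count: number;
  firstSeen: string;
  lastSeen: string;
  sample: string; // first raw message seen for this signature
}

// Assumed line format: "<ISO timestamp> <LEVEL> <message>".
const LINE_PATTERN = /^(\S+)\s+(ERROR|WARN)\s+(.*)$/;

// Replace volatile tokens (long hex ids, numbers) so similar messages share a signature.
function normalize(message: string): string {
  return message
    .replace(/\b[0-9a-f]{8,}\b/gi, "<id>")
    .replace(/\d+/g, "<n>");
}

// Collapse a raw log into one event per (level, signature), most frequent first.
function extractEvents(rawLog: string): LogEvent[] {
  const events = new Map<string, LogEvent>();
  for (const line of rawLog.split("\n")) {
    const match = LINE_PATTERN.exec(line.trim());
    if (!match) continue; // skip INFO/DEBUG and lines that do not parse
    const [, timestamp, level, message] = match;
    const signature = normalize(message);
    const key = `${level}:${signature}`;
    const existing = events.get(key);
    if (existing) {
      existing.count += 1;
      existing.lastSeen = timestamp;
    } else {
      events.set(key, {
        signature,
        level: level === "ERROR" ? "error" : "warn",
        count: 1,
        firstSeen: timestamp,
        lastSeen: timestamp,
        sample: message,
      });
    }
  }
  return Array.from(events.values()).sort((a, b) => b.count - a.count);
}
```

Thousands of repeated timeout lines then become a single event with a count, which is far easier to fit into a model prompt.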
Challenges we faced
- Preventing unsafe auto-merges required confidence scoring, diff limits, and restricted file rules
- Raw logs were too large for AI analysis, so we built structured event extraction
- Keeping fixes deterministic for live demos required controlled failure scenarios
- Designing AI workflows that assist human review rather than replace it
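The auto-merge guardrails mentioned above (confidence scoring, diff limits, restricted file rules) could be combined into a single gate along these lines. The thresholds, restricted paths, and all names here are hypothetical defaults chosen for illustration, not the values the project uses.

```typescript
// Hypothetical policy; thresholds and paths are illustrative, not DevWatch AI's settings.
interface MergePolicy {
  minConfidence: number; // minimum model confidence, 0 to 1
  maxChangedLines: number; // additions + deletions across the whole diff
  restrictedPrefixes: string[]; // paths that always require human review
}

interface ProposedFix {
  confidence: number;
  changedFiles: { path: string; additions: number; deletions: number }[];
}

interface GateResult {
  autoMerge: boolean;
  reasons: string[]; // why auto-merge was blocked; empty when allowed
}

const DEFAULT_POLICY: MergePolicy = {
  minConfidence: 0.9,
  maxChangedLines: 50,
  restrictedPrefixes: [".github/", "migrations/", "infra/"],
};

// Allows auto-merge only when every rule passes; otherwise lists each failed rule.
function evaluateAutoMerge(fix: ProposedFix, policy: MergePolicy = DEFAULT_POLICY): GateResult {
  const reasons: string[] = [];

  if (fix.confidence < policy.minConfidence) {
    reasons.push(`confidence ${fix.confidence} is below ${policy.minConfidence}`);
  }

  const changedLines = fix.changedFiles.reduce(
    (sum, file) => sum + file.additions + file.deletions,
    0,
  );
  if (changedLines > policy.maxChangedLines) {
    reasons.push(`diff touches ${changedLines} lines, limit is ${policy.maxChangedLines}`);
  }

  for (const file of fix.changedFiles) {
    if (policy.restrictedPrefixes.some((prefix) => file.path.startsWith(prefix))) {
      reasons.push(`restricted path: ${file.path}`);
    }
  }

  return { autoMerge: reasons.length === 0, reasons };
}
```

Returning the list of reasons, rather than a bare boolean, lets the pull request description explain why a fix was left for human review.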
What we learned
- AI is most effective when constrained by guardrails
- Knowledge-base grounding dramatically improves reliability
- Automation should optimize incident follow-ups, not bypass engineers
- Production tooling must be explainable to earn trust
What’s next
- Multi-language support (Python, Java, Go)
- Distributed tracing integration
- Multi-repo monitoring
- Slack and CI/CD integrations
- Advanced incident analytics
Built With
- Languages: JavaScript, TypeScript
- Frameworks: Node.js, Next.js
- Cloud Platform: DigitalOcean
- AI: DigitalOcean Gradient AI Platform, OpenAI GPT-OSS 120B
- Databases: DigitalOcean Managed PostgreSQL
- Storage: DigitalOcean Spaces
- APIs: GitHub API, NVD, npm advisories
Try it out
- GitHub Repository: https://github.com/your-username/devwatch-ai
- Live Dashboard (Demo): http://localhost:3000
- Demo App: http://localhost:3001