Inspiration
It started with a conversation with my cousin, who works as a Site Reliability Engineer (SRE). I kept hearing stories about the dreaded "2 AM pager"—the chaotic, exhausting reality of being woken up in the middle of the night just to restart a pod, patch a memory leak, or revert a broken deployment.
I wanted to know if this was just their company or an industry-wide nightmare, so I took to Reddit and DevOps communities. After speaking directly with over 15 active SREs and DevOps professionals, the consensus was clear: Mean Time to Recovery (MTTR) is largely bottlenecked by the time it takes a human to wake up, read logs, and understand the context. I built Praxis to be the ultimate teammate: an autonomous AI SRE that wakes up at 2 AM, diagnoses the pipeline failure, writes the patch, and has a Merge Request waiting for your approval before your coffee is even brewed.
How I built it
Praxis is powered by a multi-agent AI architecture, designed to divide work the way a senior engineering team would: one coordinator, several specialist reviewers, and a human with final sign-off.
- The Brain (LLM Orchestration): I used Gemini 2.5 Pro as my core coordinator agent, with Gemini 2.5 Flash handling rapid sub-tasks.
- The Governance Board: When a patch is generated, it doesn't just push to production blindly. I built a concurrent evaluation pipeline where specialized AI "Critics" (Security, FinOps, and Architecture) evaluate the patch. If the FinOps agent realizes the patch will cause runaway API costs, it vetoes the code and forces a rewrite.
- The QA Agent: Once the Governance Board passes the patch, another agent automatically writes Playwright regression tests to ensure the bug never happens again.
- Alert Tuning: I used Pinecone as the vector database for alert memory, with entries evicted on a 2-hour TTL so that repeated alerts for the same incident are suppressed instead of triggering Praxis again.
- The Human-in-the-Loop: I integrated directly with Slack's API and GitLab. Praxis sends the diagnostic summary and the patch to a Slack channel. With one click of the "Approve & Merge" button, the webhook triggers the GitLab pipeline.
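The Governance Board flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual Praxis code: the `Critic` type, the three toy critics, `reviewPatch`, `governanceLoop`, and the `rewrite` callback are all hypothetical names, and the regex checks stand in for what would be LLM calls in the real system.

```typescript
// Hypothetical types: each critic inspects a patch and returns a verdict.
type Verdict = { critic: string; approved: boolean; reason: string };
type Critic = (patch: string) => Promise<Verdict>;
type Review = { approved: boolean; vetoes: Verdict[] };

// Toy critics. In Praxis each of these would be an LLM call, not a regex.
const securityCritic: Critic = async (patch) => ({
  critic: "security",
  approved: !/eval\(/.test(patch),
  reason: "rejects patches that call eval()",
});

const finOpsCritic: Critic = async (patch) => ({
  critic: "finops",
  approved: !/while\s*\(\s*true\s*\)/.test(patch),
  reason: "rejects unbounded loops that could cause runaway API costs",
});

const architectureCritic: Critic = async (patch) => ({
  critic: "architecture",
  approved: patch.length < 10000,
  reason: "rejects patches too large to review",
});

// Run every critic concurrently; a single veto blocks the patch.
async function reviewPatch(patch: string, critics: Critic[]): Promise<Review> {
  const verdicts = await Promise.all(critics.map((critic) => critic(patch)));
  const vetoes = verdicts.filter((v) => !v.approved);
  return { approved: vetoes.length === 0, vetoes };
}

// Ask for a rewrite after each veto, giving up after maxRounds so a human can step in.
async function governanceLoop(
  initialPatch: string,
  rewrite: (patch: string, vetoes: Verdict[]) => Promise<string>,
  critics: Critic[],
  maxRounds: number = 3
): Promise<string | null> {
  let patch = initialPatch;
  for (let round = 0; round < maxRounds; round++) {
    const review = await reviewPatch(patch, critics);
    if (review.approved) return patch;
    patch = await rewrite(patch, review.vetoes);
  }
  return null; // no patch passed the board within the round limit
}
```

Capping the number of rewrite rounds is a design assumption in this sketch: it keeps a disagreement between critics and the patch generator from looping indefinitely and hands the incident back to a human instead.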
Challenges I ran into
Building a fully autonomous system that writes and commits real code is terrifying, and I ran into several massive architectural hurdles:
- The "Split-Brain" Webhook Bug: Managing state between my local AI environment and my cloud production webhooks caused my Slack interactive buttons to fail. I had to implement secure tunneling and state management to ensure the patch approved in Slack matched the exact payload sitting in my server's memory.
- Context Window Exhaustion: Feeding thousands of lines of telemetry, logs, and Git history into the model at once initially caused hallucinations. I solved this with the multi-agent design, breaking the problem down so each Gemini agent only sees the specific context it needs.
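One way to make sure the patch approved in Slack is the exact payload held in server memory is to bind the button to a content hash. The sketch below assumes an in-memory `Map` and hypothetical helpers `registerPatch` and `approve`; it relies only on the fact that a Slack button carries a `value` string that comes back in the interaction payload. It does not cover verifying Slack's request signature, which a real webhook would also need.

```typescript
import { createHash, randomUUID } from "node:crypto";

// Hypothetical in-memory store of patches awaiting approval.
type PendingPatch = { patch: string; digest: string };
const pendingPatches = new Map<string, PendingPatch>();

function sha256Hex(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Called when the patch is posted to Slack.
// The returned string is what would go in the button's `value` field.
function registerPatch(patch: string): string {
  const id = randomUUID();
  const digest = sha256Hex(patch);
  pendingPatches.set(id, { patch, digest });
  return `${id}:${digest}`;
}

// Called from the interaction webhook with the clicked button's value.
// Returns the patch only if it is still pending and its hash matches.
function approve(buttonValue: string): string {
  const [id, digest] = buttonValue.split(":");
  const entry = pendingPatches.get(id);
  if (!entry) {
    throw new Error("unknown or already-handled approval");
  }
  // Check both the stored digest and a fresh hash of the stored patch.
  if (entry.digest !== digest || sha256Hex(entry.patch) !== digest) {
    throw new Error("approved patch does not match the patch in memory");
  }
  pendingPatches.delete(id); // approvals are one-shot
  return entry.patch;
}
```

A process-local `Map` only works if the server that posted the message is the same one that receives the click; if the local and cloud environments both handle webhooks, the pending store would have to live somewhere both can reach.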
Accomplishments that I'm proud of
In enterprise architecture, system availability is defined mathematically:
$$\text{Availability} = \frac{\text{MTBF}}{\text{MTBF} + \text{MTTR}}$$
Here MTBF is Mean Time Between Failures and MTTR is Mean Time to Recovery. By automating the triage, root-cause analysis, and patching phases, Praxis aims to shrink the MTTR term in the denominator toward the time it takes a human to click "Approve."
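As a worked illustration with purely hypothetical numbers (these are not measurements from Praxis), assume an MTBF of 720 hours and compare an MTTR of 1 hour with an MTTR of 5 minutes (1/12 hour):

$$\text{Availability}_{1\,\text{h}} = \frac{720}{720 + 1} \approx 0.99861$$
$$\text{Availability}_{5\,\text{min}} = \frac{720}{720 + \tfrac{1}{12}} \approx 0.99988$$

Under these assumed figures, unavailability falls from about 0.139% to about 0.012%, roughly a twelvefold reduction. That tracks the drop in MTTR, because when MTTR is much smaller than MTBF, unavailability is approximately MTTR / MTBF.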
I am incredibly proud of successfully orchestrating multiple AI agents to debate each other (The Governance Board) and arrive at a secure, cost-effective code patch without any human hand-holding.
What I learned
I learned that the future of AI in software engineering isn't about replacing developers; it's about shifting developers from being "writers" to "reviewers." I also learned the hard way about the complexities of Slack's interactive webhook payloads and how to properly tune a Pinecone vector database for short-term memory eviction.
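The short-term eviction idea can be illustrated without Pinecone at all. The sketch below is a hypothetical in-memory stand-in, not the Praxis implementation: the `AlertMemory` class and its methods are invented for illustration, and it matches alerts by an exact fingerprint string rather than by vector similarity.

```typescript
// The 2-hour eviction window from the write-up, in milliseconds.
const ALERT_TTL_MS = 2 * 60 * 60 * 1000;

// Hypothetical in-memory stand-in for the alert memory.
class AlertMemory {
  private firstSeen = new Map<string, number>();
  private ttlMs: number;

  constructor(ttlMs: number = ALERT_TTL_MS) {
    this.ttlMs = ttlMs;
  }

  // Returns true if the alert is new and should be handled,
  // false if the same fingerprint was already seen within the TTL.
  shouldProcess(fingerprint: string, now: number = Date.now()): boolean {
    this.evictExpired(now);
    if (this.firstSeen.has(fingerprint)) {
      return false;
    }
    this.firstSeen.set(fingerprint, now);
    return true;
  }

  // Number of fingerprints currently remembered.
  count(): number {
    return this.firstSeen.size;
  }

  // Drop every entry whose age has reached the TTL.
  private evictExpired(now: number): void {
    this.firstSeen.forEach((seenAt, key) => {
      if (now - seenAt >= this.ttlMs) {
        this.firstSeen.delete(key);
      }
    });
  }
}
```

In this sketch a duplicate does not refresh the timestamp, so the window is measured from the first sighting. An alert that keeps firing is therefore picked up again after two hours rather than being silenced indefinitely.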
What's next for Praxis
- Auto-Rollbacks: Integrating with Datadog to allow Praxis to autonomously revert Git commits if it detects a spike in 500 errors immediately after a deployment.
- Runbook Ingestion: Allowing enterprise teams to upload their company's custom PDF/Markdown runbooks so Praxis can follow strict internal compliance steps during an incident.
Built With
- gemini
- gemini-2.5-flash
- gemini-2.5-pro
- gitlab
- ngrok
- node.js
- pinecone
- playwright
- react
- render
- slack
- typescript
- vercel
- vertex-ai