Inspiration
AI coding assistants are writing more code than ever. The problem? They're also introducing bugs faster than developers can catch them. A team might fix one null-check vulnerability in a PR review, only to discover the same pattern lurking in twelve other files, then watch it reappear a month later when someone copies an old snippet. We kept asking ourselves: what if fixing one bug could immunize the entire codebase against that class of vulnerability, just like a vaccine?
So we built PHOENIX. Like a phoenix rising from its ashes and emerging tougher, the codebase evolves and becomes immune to entire classes of bugs that once lived in it.
What it does
PHOENIX is a 7-agent pipeline that transforms a single failing test into permanent protection. When a pipeline breaks, Triage investigates and classifies the failure. Surgeon patches the immediate problem. Pathologist abstracts the bug into a searchable anti-pattern. Hunter scours the codebase for every sibling vulnerability. Immunizer fixes all of them and generates a Semgrep rule. Arbiter validates everything and opens a merge request. Guardian then stands watch on future PRs, blocking any attempt to reintroduce the same vulnerability class.
The core agentic flow is: Triage -> Surgeon -> Pathologist -> Hunter -> Immunizer -> Arbiter -> Infra Reporter
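The handoff chain and its conditional branch can be sketched in a few lines of TypeScript. This is an illustrative model only: the types, the regex heuristic standing in for Triage, and the function names are our own stand-ins, not the actual Duo agent implementations.

```typescript
// Sketch of PHOENIX's routing: Triage classifies a failure, then either
// halts with an infra report or sends it through the immunization chain.
type Failure = { log: string };
type Verdict = "code-bug" | "infra-failure";

// A trivial heuristic stands in for the real Triage agent here.
function triage(f: Failure): Verdict {
  return /ECONNREFUSED|timeout/i.test(f.log) ? "infra-failure" : "code-bug";
}

// Downstream agents modeled as named steps over a shared context.
const immunizationChain = [
  "Surgeon", "Pathologist", "Hunter", "Immunizer", "Arbiter",
];

function route(f: Failure): string[] {
  // Infrastructure failures are reported and halted; code bugs
  // flow through the full chain.
  return triage(f) === "infra-failure"
    ? ["Infra Reporter"]
    : immunizationChain;
}
```

In the real pipeline each step is a Duo agent with its own tools and handoff schema; the sketch only captures the branching shape.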
Phoenix Guardian then watches for recurring bug patterns in MRs, automatically detecting them and warning the developer.
How we built it
We built PHOENIX on the GitLab Duo Agent Platform, and used Bun and TypeScript to build a simple API for testing the agent flow.
Each agent has a distinct responsibility and a curated set of GitLab-native tools: 30 unique tools across the system. The pipeline uses conditional routing: infrastructure failures get reported and halted, while code bugs flow through the full immunization chain. Semgrep handles static analysis and rule generation. The whole thing runs against a Fastify API we intentionally seeded with a vulnerable pattern for demonstration.
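To make the Semgrep step concrete: Immunizer turns each abstracted anti-pattern into a rule. The rule below is a hand-written sketch of the kind of output we mean, guarding the null-check class from our demo; the rule id, message, and `findUser` pattern are hypothetical, not the actual generated rule.

```yaml
rules:
  - id: unchecked-nullable-lookup   # hypothetical id, for illustration
    languages: [typescript]
    severity: ERROR
    message: findUser() can return null; guard the result before dereferencing it.
    # Flag any dereference of the lookup result; "..." matches
    # intervening statements.
    pattern: |
      const $U = findUser($ID);
      ...
      $U.$PROP
```

A rule like this is what lets Guardian block the pattern on future PRs instead of relying on reviewers to remember it.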
Challenges we ran into
Agent orchestration is harder than it looks. Getting seven specialized agents to pass context cleanly without bloating prompts or losing critical details required careful schema design. We also hit friction with cross-project search availability depending on GitLab tier, which forced us to make Hunter gracefully degrade when Advanced Search isn't present. Validating generated Semgrep rules before committing them without breaking CI took more iteration than expected.
We also wanted to integrate Slack, but that proved difficult as well.
Accomplishments that we're proud of
The variant analysis pipeline actually works. Fix one bug, and PHOENIX genuinely finds and patches siblings across the codebase. The Semgrep rule generation means we're not just cleaning up today's mess; we're preventing tomorrow's. The conditional routing between code bugs and infrastructure failures keeps the system honest instead of hallucinating fixes for problems that don't exist in source code.
What we learned
Multi-agent systems need sharp boundaries. Letting each agent do one thing well and explicitly defining what it hands off to the next prevented the kind of scope creep that turns orchestration into chaos. We also learned that the "fix" is only half the value; the "prevent" is what actually changes the security posture long-term.
We also learned how to set up custom agents and flows using GitLab Duo's native platform.
What's next for PHOENIX
Cross-project immunization at scale. When Hunter finds a vulnerability pattern in one repository, there's no reason the same rule can't propagate to every project in a group. We're also looking at feedback loops and tracking which Semgrep rules block the most violations over time and surfacing that data to teams so they understand where their codebase keeps trying to regress.
Eventually we plan to integrate third-party apps with GitLab as well.
Built With
- bun
- gitlab
- gitlab-duo
- typescript
- yml

