Inspiration
In the fast-paced world of mobile development, security is often treated as a "final check" rather than a continuous process. As a cybersecurity researcher in the Philippines, I’ve seen many projects delayed, or worse, breached, because manual penetration testing couldn't keep up with CI/CD speeds. I wanted to build an agent that doesn't just "scan" code, but challenges it. I was inspired by the idea of an "AI Red-Teamer" that lives inside GitLab and gives immediate, adversarial feedback the moment a developer pushes code.
What it does
Cyber-Ifrit: The Shadow Auditor is an autonomous security agent that bridges the gap between static analysis and active exploitation.
Adversarial Analysis: It monitors Merge Requests for high-risk patterns (insecure API endpoints, hardcoded secrets, or logic flaws).
Proof of Concept (PoC) Generation: Instead of just flagging a line of code, it spins up a temporary Google Cloud Sandbox and attempts to exploit the vulnerability.
Automated Remediation: Once an exploit is verified, it uses Anthropic Claude 3.5 Sonnet to generate a secure patch and suggests the fix directly in the GitLab MR, closing the loop between discovery and defense.
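To make the first step concrete, here is a minimal sketch of how an MR diff could be screened for high-risk patterns before anything is sent to the sandbox. The rule names, the regular expressions, and the Finding and scan_added_lines names are all assumptions made for illustration; the write-up does not list the agent's actual rules, and logic flaws would need more than pattern matching.

```python
import re
from dataclasses import dataclass

# Hypothetical rules for illustration only; the agent's real patterns are not
# described in this write-up.
RULES = {
    "hardcoded-secret": re.compile(
        r"""\b(api[_-]?key|secret|password|token)\b\s*[:=]\s*["'][^"']{8,}["']""",
        re.IGNORECASE,
    ),
    "insecure-endpoint": re.compile(r"""["']http://[^"']+["']"""),
}


@dataclass(frozen=True)
class Finding:
    rule: str      # key from RULES
    line_no: int   # 1-based line number within the diff text
    text: str      # the added source line, without the leading '+'


def scan_added_lines(diff: str) -> list[Finding]:
    """Return findings for lines that a unified diff adds."""
    findings = []
    for line_no, line in enumerate(diff.splitlines(), start=1):
        # Only added lines matter; '+++' is the file header, not content.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for rule, pattern in RULES.items():
            if pattern.search(line):
                findings.append(Finding(rule, line_no, line[1:].strip()))
    return findings
```

Each finding here is only a candidate attack vector. In the flow described above, it would still have to be exploited in the sandbox before it is reported.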
How we built it
The project is built on the GitLab Duo Agent Platform using a multi-agent flow with four roles:
The Scout (GitLab Duo Flow): Triggers on MR events to identify potential attack vectors.
The Orchestrator (Google Cloud Vertex AI): Manages ephemeral Google Cloud Run environments to host "victim" instances of the code.
The Attacker (Python/Rust): A library of custom security scripts executed by the agent to perform dynamic analysis.
The Guardian (Anthropic): Processes the exploit results to provide human-readable explanations and code-level fixes.
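The hand-off between the four roles can be sketched as a chain of stages that share one state object. This is a simplified illustration under stated assumptions, not the project's code: AuditState, its fields, and run_flow are hypothetical names, and every stage body is a stand-in that makes no real GitLab, Google Cloud, or Anthropic call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class AuditState:
    """State passed from stage to stage (all field names are hypothetical)."""
    mr_id: int
    diff: str
    vectors: List[str] = field(default_factory=list)    # filled by the Scout
    sandbox_url: Optional[str] = None                    # filled by the Orchestrator
    exploited: List[str] = field(default_factory=list)  # filled by the Attacker
    report: str = ""                                     # filled by the Guardian


Stage = Callable[[AuditState], AuditState]


def scout(state: AuditState) -> AuditState:
    # Stand-in heuristic: a raw execute() call is treated as a possible vector.
    if "execute(" in state.diff:
        state.vectors.append("sql-injection")
    return state


def orchestrator(state: AuditState) -> AuditState:
    # Placeholder URL; a real version would deploy an ephemeral "victim" service.
    state.sandbox_url = f"https://sandbox.invalid/mr-{state.mr_id}"
    return state


def attacker(state: AuditState) -> AuditState:
    # Stub: marks every vector as exploited instead of running attack scripts.
    state.exploited = list(state.vectors)
    return state


def guardian(state: AuditState) -> AuditState:
    # Stub: a real version would ask the model for an explanation and a patch.
    state.report = "Confirmed: " + ", ".join(state.exploited)
    return state


STAGES: Sequence[Stage] = (scout, orchestrator, attacker, guardian)


def run_flow(state: AuditState, stages: Sequence[Stage] = STAGES) -> AuditState:
    """Run the stages in order, stopping early if there is nothing to attack."""
    for stage in stages:
        state = stage(state)
        if not state.vectors:
            break  # no vectors found, so no sandbox is created
    return state
```

Stopping as soon as the Scout finds nothing means no sandbox is created for a clean Merge Request. The write-up does not say whether the real flow does this.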
Challenges we ran into
The primary challenge was "Deterministic Exploitation." AI models can hallucinate vulnerabilities. To address this, we implemented a "Verification Gate": the agent is not allowed to report a critical bug unless it successfully executes a PoC in the Google Cloud sandbox. Orchestrating the three-way communication between GitLab, Google Cloud, and the AI models within the 6-day hackathon window required close attention to API latency and state management.
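A minimal sketch of the Verification Gate idea follows. It takes the PoC runner as a plain callable so the example stays self-contained. Several details are assumptions rather than documented behavior: the Candidate, poc_confirms, and gate names, the retry count (added because a sandbox call can time out), and the choice to pass unproven non-critical findings through as advisories.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Candidate:
    title: str
    severity: str  # e.g. "critical", "high", "medium"


def poc_confirms(candidate: Candidate,
                 run_poc: Callable[[Candidate], bool],
                 attempts: int = 3) -> bool:
    """Return True as soon as one PoC attempt succeeds in the sandbox."""
    for _ in range(attempts):
        try:
            if run_poc(candidate):
                return True
        except Exception:
            # A crashing or timed-out PoC counts as a failed attempt.
            continue
    return False


def gate(candidate: Candidate,
         run_poc: Callable[[Candidate], bool]) -> Optional[str]:
    """Return the message to post on the MR, or None to stay silent."""
    if poc_confirms(candidate, run_poc):
        return f"VERIFIED {candidate.severity}: {candidate.title}"
    if candidate.severity == "critical":
        # The rule from the text: no working PoC means no critical report.
        return None
    # Assumption: lower severities may still be surfaced, clearly labeled.
    return f"ADVISORY {candidate.severity}: {candidate.title}"
```

Because run_poc is passed in, the gate can be unit-tested with a fake runner without touching any cloud environment.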
Accomplishments that we're proud of
We built a system where a developer can see a "Hacked!" comment on their code within minutes of pushing. Integrating Google Cloud's infrastructure as a "firing range" for the GitLab agent was a major technical win. We are also proud of the low false-positive rate that comes from requiring "Proof of Concept" verification.
We are particularly proud of our Recursive Learning Loop, which allows the agent to persist 'Lessons Learned' across pipeline runs. If Ifrit fails to exploit a target in one cycle, it autonomously adapts its strategy for the next run, so the agent's attack strategy evolves from one pipeline run to the next.
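One simple way to persist 'Lessons Learned' between runs is a small JSON file that records which strategies have already failed against each target. The sketch below assumes that design. The file layout, the function names, and the strategy labels are hypothetical, and the file would need to survive between pipeline runs (for example through a cache or artifact), which the write-up does not describe.

```python
import json
from pathlib import Path
from typing import Dict, List, Optional


def load_lessons(path: Path) -> Dict[str, List[str]]:
    """Load the mapping of target -> strategies that already failed."""
    return json.loads(path.read_text()) if path.exists() else {}


def record_failure(path: Path, target: str, strategy: str) -> None:
    """Remember that a strategy did not work against a target."""
    lessons = load_lessons(path)
    tried = lessons.setdefault(target, [])
    if strategy not in tried:
        tried.append(strategy)
    path.write_text(json.dumps(lessons, indent=2))


def next_strategy(path: Path, target: str,
                  available: List[str]) -> Optional[str]:
    """Pick the first strategy not yet tried on this target, if any."""
    tried = set(load_lessons(path).get(target, []))
    return next((s for s in available if s not in tried), None)
```

A run would call next_strategy before attacking and record_failure after a failed attempt, so the following run starts from a different strategy.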
What we learned
We learned that the true power of the GitLab Duo Agent Platform isn't just in code completion, but in External Orchestration. By allowing agents to interact with cloud environments and real security tools, we move from "Chatbots" to "Action-Bots." We also gained deep insights into how Anthropic's reasoning capabilities can be used to translate complex security exploits into educational feedback for developers.
What's next for Cyber-Ifrit: The Shadow Auditor
This is just the beginning for Cyber-Ifrit. Our roadmap includes:
Support for Flutter/Mobile Native: Deepening the agent's ability to detect mobile-specific binary vulnerabilities.
Persistent Red-Teaming: Moving beyond MRs to perform scheduled "Chaos Security" tests on production-like environments.
Enterprise Integration: Expanding the "Shadow Auditor" to help small-to-medium businesses in the Philippines secure their POS and fintech applications.