Inspiration
The current paradigm of DevSecOps is broken when it comes to AI infrastructure. Traditional SAST and DAST tools parse syntax, but they cannot parse semantic intent or adversarial ML payloads like prompt injections, jailbreaks, or Byzantine data poisoning.
Defending next-generation AI pipelines requires more than regex filters or generic chatbots. It requires an intelligence capable of attacking the pipeline, and a foundational mathematical framework to prove those attacks fail. We built Project Brahma to shift AI security from "probabilistic guessing" to "mathematical certainty," bringing zero-trust, nation-state grade engineering to the GitLab CI/CD pipeline.
What it does
Project Brahma is an autonomous Red Team / Blue Team wargame embedded directly inside the GitLab Duo Agent ecosystem. When a developer opens a Merge Request (MR) that modifies an ML pipeline (e.g., updating prompt templates or RAG architecture), the GitLab Duo Agent triggers our custom DevSecOps flow:
The Red Team Attack: A Python microservice intercepts the MR context and uses Anthropic's Claude 3.5 Sonnet to dynamically generate a highly sophisticated attack matrix (data exfiltration, prompt overrides) tailored strictly to the new code.
The Blue Team Defense: A high-speed, memory-safe Rust engine ingests this attack matrix. Instead of using a secondary LLM to "guess" if the payload is malicious, it maps the application's execution state to a First-Order Logic matrix and uses the Z3 Satisfiability Modulo Theories (SMT) solver to mathematically prove whether the code is vulnerable.
Concrete Action: If the mathematical proof returns SAT (Vulnerable), the GitLab pipeline is instantly halted, and a verifiable failure proof is posted to the MR. If it returns UNSAT (Safe), the MR is allowed to proceed.
How we built it
The architecture is built on a dual-track microservice model designed to run in a Google Cloud Platform (GCP) Confidential Space enclave.
The Offensive Layer (Python + Anthropic): We utilized the GitLab Duo Agent Custom Flow (config.yaml) to capture pipeline events. A Python FastAPI orchestrator uses pydantic schemas and the Anthropic SDK to enforce strict JSON tool-calling, turning Claude 3.5 Sonnet into an autonomous adversarial payload generator.
The Defensive Layer (Rust + Z3 SMT): We built a custom formal verification engine in Rust. We model the ML application as a state transition system $M = \langle S, S_0, T \rangle$.
To prevent Data Exfiltration (Taint-Tracking), we mathematically prove that no sensitive variable $V_s$ can reach an external egress sink $E$. We feed the Z3 solver the negation of our safety property:
$$\exists a \in A, \exists v \in V_s \text{ such that } (a = v) \land (C \in E)$$
If the Z3 solver returns SAT, an attack path exists. If it returns UNSAT, the pipeline is mathematically proven to be secure.
To defend against Prompt Overrides, we map the core immutable instructions $I_{\text{core}}$ against the execution state $f(x)$ forced by the adversarial payload $x$, asserting that the core instructions can never be negated: $$f(x) \cap \neg I_{\text{core}} = \emptyset$$
To maximize execution speed, we implemented a Thread-Local Z3 Solver Pool, Lazy AST Instantiation, and used the aho-corasick crate for zero-allocation, lock-free string matching.
Challenges we ran into
Bridging a thread-unsafe C++ library (z3-sys) with Rust's highly concurrent tokio async runtime was a massive architectural hurdle. Because the Z3 Context contains raw pointers, it violates Rust's strict Send and Sync concurrency traits. We had to architect a custom thread-pool isolation layer using tokio::sync::mpsc channels to pass native Rust types (Strings and Enums) across the boundary without blocking the async executor.
Furthermore, local compilation of C++ bindings on Windows proved fragile. We solved this by containerizing the Blue Team engine using a multi-stage Docker build targeted for Debian Linux, seamlessly prepping the workload for GCP Confidential Computing.
Accomplishments that we're proud of
We successfully translated abstract, natural-language adversarial attacks into rigid mathematical constraints that a theorem prover can process. By eliminating heap thrashing (replacing to_lowercase() allocations with Aho-Corasick automatons) and building a zero-panic Rust architecture, we engineered a security engine that adds near-zero latency to the CI/CD pipeline while offering absolute deterministic security.
What we learned
Building Project Brahma was a masterclass in the intersection of Systems Programming and Applied Mathematics. We gained deep, practical experience in managing !Send memory boundaries in Rust, orchestrating asynchronous LLM tool-calling with Anthropic, and leveraging the GitLab Duo platform for headless, action-oriented automation rather than standard chat interfaces.
What's next for Project Brahma
The immediate next step is expanding the Z3 constraint models to automatically generate and commit the corresponding Rust/Python sanitization patches directly to the developer's Merge Request. Long term, we plan to integrate eBPF (Extended Berkeley Packet Filter) kernel-level tracing to catch zero-day dependency poisoning alongside the AI vulnerabilities.
Built With
- anthropic
- docker
- fastapi
- gitlab-duo
- google-cloud
- pydantic
- python
- rust
- z3


Log in or sign up for Devpost to join the conversation.