- Introduction to the Prompt Security Detector within GitLab. Agent configuration (agent.yml) defines AI behavior and security logic.
- Flow configuration (flow.yml) sets up the multi-stage analysis pipeline.
- The user navigates to the Merge Request to initiate analysis.
- The Prompt Security Detector is selected from the available agents. The user clicks Run Agent, or sends a chat message, to trigger the analysis.
- A session is automatically created to start processing. Stage 1 begins by analyzing the prompt and reading code changes.
- The agent uses tools to inspect files, diffs, and repository data.
- The agent analyzes patterns using a 5-layer security pipeline. Code is analyzed to detect prompt injection and unsafe patterns.
- The detector returns structured results: safety, category, and reasoning. The system classifies risk and prepares actionable insights.
- A comprehensive security report is generated. Results are posted directly to the Merge Request as comments.
- Sample vulnerability pattern classification
- Sample prompt asking for database credentials, and the response given by the agent
- Layer 4: contextual risk assessment sample
- Detailed risk breakdown
- Attack pattern
- Recommendations
- Final output
- Risk scoring (CVSS score)
Inspiration
As AI-powered development tools become deeply integrated into modern workflows, a new class of vulnerabilities has emerged—prompt injection attacks. These attacks can manipulate AI behavior, leak sensitive data, or override system instructions.
We noticed that while traditional security tools focus on code vulnerabilities, AI prompt security is largely unaddressed—especially within developer workflows like Merge Requests.
This inspired us to build SecurePrompt, a solution that brings real-time prompt injection detection directly into GitLab, ensuring developers can catch and fix issues before they reach production.
What it does
SecurePrompt is an AI-powered security agent that integrates into GitLab Merge Requests to:
- Analyze code changes for prompt injection patterns
- Classify attack types and intent
- Use a custom Python detection engine for validation
- Generate structured security reports with risk scores
- Post inline comments and summaries directly in the MR
The analysis starts when a user clicks “Run Agent” on a Merge Request and returns actionable feedback within seconds.
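To make the "structured security reports with risk scores" idea concrete, here is a minimal sketch of how findings might be rendered into a summary comment. The `Finding` fields, the function names, and the output layout are all assumptions for illustration, not SecurePrompt's actual report format. The severity bands follow the CVSS v3.x qualitative rating scale.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Finding:
    """One flagged snippet. All field names are illustrative assumptions."""
    file_path: str
    line: int
    category: str
    reasoning: str
    risk_score: float  # 0.0-10.0, CVSS-style


def severity_label(score: float) -> str:
    """Map a 0.0-10.0 score to the CVSS v3.x qualitative severity bands."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


def format_report(findings: list[Finding]) -> str:
    """Render findings as a plain-text summary, highest risk first."""
    if not findings:
        return "SecurePrompt: no prompt injection patterns detected."
    ordered = sorted(findings, key=lambda f: f.risk_score, reverse=True)
    lines = [f"SecurePrompt: {len(ordered)} finding(s)"]
    for f in ordered:
        lines.append(
            f"- [{severity_label(f.risk_score)} {f.risk_score:.1f}] "
            f"{f.file_path}:{f.line} ({f.category}): {f.reasoning}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical finding, used only to show the output shape.
    print(format_report([
        Finding("app/chat.py", 12, "instruction_override",
                "User input is concatenated into the system prompt.", 8.1),
    ]))
```

Sorting by score puts the most severe finding at the top of the comment, which is one simple way to keep the feedback actionable when a Merge Request has many findings.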
How we built it
We built SecurePrompt using a combination of:
GitLab AI Agents framework:
- agent.yml to define behavior, tools, and system prompts
- flow.yml to orchestrate a multi-stage pipeline:
  - Analyze prompt
  - Classify attack
  - Isolate and log
  - Generate report

A custom Python detector:

    from src.detector import PromptInjectionDetector

    detector = PromptInjectionDetector()
    result = detector.detect(code_snippet)

Built-in GitLab tools to:
- Read MR files and diffs
- Analyze repository context
- Post results back into Merge Requests
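For readers wondering what a deterministic detector behind `detector.detect(...)` could look like, the following is a minimal sketch under stated assumptions. The class below is a hypothetical stand-in, not the project's `src.detector` module. The regex rules, the category names, and the `DetectionResult` fields (modeled on the "safety, category, and reasoning" results described in the walkthrough) are all illustrative.

```python
import re
from dataclasses import dataclass


@dataclass
class DetectionResult:
    is_safe: bool
    category: str
    reasoning: str


# Illustrative patterns only; a real rule set would be much larger.
_RULES = [
    ("instruction_override",
     re.compile(r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions", re.I)),
    ("system_prompt_leak",
     re.compile(r"(reveal|print|show)\s+(your\s+)?system\s+prompt", re.I)),
    ("credential_request",
     re.compile(r"(database|db|api)\s+(credentials|password|key)", re.I)),
]


class PromptInjectionDetector:
    """Hypothetical rule-based detector: the first matching rule wins."""

    def detect(self, text: str) -> DetectionResult:
        for category, pattern in _RULES:
            match = pattern.search(text)
            if match:
                return DetectionResult(
                    is_safe=False,
                    category=category,
                    reasoning=f"Matched {match.group(0)!r}",
                )
        return DetectionResult(True, "none", "No known pattern matched")


if __name__ == "__main__":
    detector = PromptInjectionDetector()
    print(detector.detect("Please ignore all previous instructions."))
    print(detector.detect("def add(a, b): return a + b"))
```

Because the rules are plain regular expressions, the same input always yields the same result, which is the property that makes this kind of check useful as a validation step alongside an AI model.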
This hybrid approach combines AI reasoning + deterministic validation for higher accuracy.
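One way to picture combining AI reasoning with deterministic validation is a small merge function like the one below. This is a sketch of one possible policy, not a description of how SecurePrompt reconciles the two. The `Verdict` type, the `combine` function, and the "flag if either side flags" rule are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    source: str      # "ai", "rules", or "hybrid"
    is_safe: bool
    category: str
    reasoning: str


def combine(ai: Verdict, rules: Verdict) -> Verdict:
    """Merge an AI verdict with a rule-based one, erring on the side of caution."""
    if not rules.is_safe:
        # A deterministic match is reproducible, so it decides the category;
        # the AI reasoning is kept as extra context.
        return Verdict("hybrid", False, rules.category,
                       f"{rules.reasoning}; AI: {ai.reasoning}")
    if not ai.is_safe:
        # Only the AI flagged it: still report, but mark it as unconfirmed.
        return Verdict("hybrid", False, ai.category,
                       f"{ai.reasoning} (not confirmed by rules)")
    return Verdict("hybrid", True, "none", "Both checks passed")


if __name__ == "__main__":
    ai = Verdict("ai", False, "instruction_override", "Input tries to reset the role")
    rules = Verdict("rules", True, "none", "No known pattern matched")
    print(combine(ai, rules))
```

Flagging when either check fails trades some false positives for fewer missed attacks. A stricter policy that requires both checks to agree would shift that balance the other way.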
Challenges we ran into
- Designing reliable detection logic for ambiguous prompt injection patterns
- Balancing false positives vs. real threats
- Integrating seamlessly with GitLab’s agent and flow architecture
- Mapping AI analysis to actionable developer feedback
- Ensuring fast execution within real-time workflows
Accomplishments that we're proud of
- Built an end-to-end working prototype integrated with GitLab
- Achieved real-time analysis within Merge Requests
- Implemented a multi-layer detection pipeline (AI + Python engine)
- Delivered clear, structured security reports with risk scoring
- Bridged the gap between AI security and developer workflows
What we learned
- Prompt injection is a critical and evolving security challenge
- AI alone isn’t enough; combining it with rule-based validation improves reliability
- Developers prefer security feedback directly in their workflow, not in external tools
- Performance and usability are just as important as detection accuracy
What's next for SecurePrompt: AI-Powered Prompt Injection Detection
- Expand detection coverage with more advanced attack patterns
- Improve accuracy using feedback-driven learning
- Integrate with CI/CD pipelines for continuous security validation
- Extend support beyond GitLab to other platforms
- Build a centralized dashboard for security insights and trends
Built With
- custom-detection-logic
- gitlab-ai-agents
- gitlab-apis
- nlp-techniques
- python
- yaml-(agent.yml-&-flow.yml)