Inspiration

The inspiration for EcoSecure Agent came from a frustration every developer knows but rarely talks about: security tools that nobody actually uses.

The tools exist. Static analyzers, vulnerability scanners, security dashboards: the ecosystem is mature. Yet the 2024 Verizon Data Breach Investigations Report found that over 80% of breaches still involve hardcoded or stolen credentials. The average time to detect a breach remains 207 days. So why isn't the problem getting better?

We realized the issue isn't the tools themselves. It's where they live. Security tools demand context switching: you finish writing code, push it, and then you're supposed to stop everything, open a separate dashboard, wait for a scan, read a report, and then come back to coding. Nobody does that consistently. Context switching is the silent killer of security habits.

Then we asked a different question: where do developers actually live? The answer was obvious. Slack. Slack is open on every developer's screen, every single day. It's where standups happen, where incidents get reported, where teams communicate. And yet not a single mainstream security tool sends real-time alerts there natively on every push.

That gap became EcoSecure Agent. And when we started thinking about the environmental cost of running AI at scale, a cost that's completely invisible to most teams, the Green Security metric became the natural second pillar of the project.

What It Does

EcoSecure Agent is an AI-powered security agent that monitors your GitLab repository and delivers instant, actionable security alerts directly to Slack: automatically, on every single code push, with zero manual steps.

The moment a developer pushes code, GitLab CI/CD triggers automatically. The pipeline extracts the code diff and sends it to the backend. Groq's Llama 3.1 model analyzes the diff for security vulnerabilities: hardcoded secrets, SQL injection, weak cryptography, sensitive data in logs, authentication bypasses, remote code execution risks, and more. Within seconds, a richly formatted alert appears in Slack with the severity level, a plain-English description of the issue, and a specific, actionable fix suggestion.

Every alert also displays the carbon footprint of the AI analysis: the CO2 equivalent emitted by the Groq API call, calculated from token usage. A typical scan costs approximately 0.00004 kg CO2e, roughly a fifth of the commonly cited estimate for a single Google search. Over thousands of scans, teams get a real, trackable sustainability metric for their AI tooling.
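As a rough illustration of a token-based calculation like this one, the sketch below multiplies the total token count by a per-token emission factor. The factor, the function names, and the output format are assumptions made for this example rather than the project's actual values; the factor is picked only so that a 1,000-token scan lands on the 0.00004 kg figure quoted above.

```python
# Hypothetical emission factor: kg CO2e attributed to each processed token.
# This value is an assumption for illustration, not a measured figure.
KG_CO2E_PER_TOKEN = 4e-8


def estimate_co2e_kg(prompt_tokens: int, completion_tokens: int,
                     kg_per_token: float = KG_CO2E_PER_TOKEN) -> float:
    """Estimate the CO2e of one API call from its reported token usage."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (prompt_tokens + completion_tokens) * kg_per_token


def format_co2e(kg: float) -> str:
    """Render the estimate the way it might appear in an alert."""
    return f"{kg:.5f} kg CO2e"


if __name__ == "__main__":
    # Example: 800 prompt tokens plus 200 completion tokens.
    print(format_co2e(estimate_co2e_kg(800, 200)))
```

Keeping the factor as a named constant means the estimate can be recalibrated in one place if a better figure for the hosting provider becomes available.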

Three outcomes on every push: a HIGH severity alert with a red notification, a MEDIUM severity alert with an amber notification, or a clean green notification confirming no issues found. The developer never leaves Slack. The agent comes to them.

How We Built It

We built EcoSecure Agent as a lean, fully free-tier stack across four platforms.

The interface layer is Slack's Incoming Webhooks API. We use Block Kit to compose rich, structured messages with color-coded severity, organized fields, and direct links back to the GitLab merge request. No Slack app installation is required, just a webhook URL, making it deployable to any workspace in under two minutes.
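A minimal sketch of how such an alert could be assembled and posted is shown below. The color values, field layout, and function names are assumptions for illustration; the colored bar comes from wrapping the blocks in a Slack attachment with a color, which is one way to get color-coded severity and may differ from what the project does. The standard library is used for the HTTP call to keep the sketch dependency-free.

```python
import json
import urllib.request

# Hypothetical severity-to-color mapping (red, amber, green).
SEVERITY_COLORS = {"HIGH": "#D32F2F", "MEDIUM": "#FFA000", "NONE": "#2E7D32"}
FALLBACK_COLOR = "#9E9E9E"  # neutral gray for unexpected severity values


def build_alert(severity: str, summary: str, fix: str,
                mr_url: str, co2e_kg: float) -> dict:
    """Build a webhook payload: Block Kit blocks inside a colored attachment."""
    title = "No issues found" if severity == "NONE" else f"{severity} severity issue"
    blocks = [
        {"type": "header", "text": {"type": "plain_text", "text": title}},
        {"type": "section", "fields": [
            {"type": "mrkdwn", "text": f"*Issue*\n{summary}"},
            {"type": "mrkdwn", "text": f"*Suggested fix*\n{fix}"},
        ]},
        {"type": "context", "elements": [
            {"type": "mrkdwn",
             "text": f"<{mr_url}|View merge request> | {co2e_kg:.5f} kg CO2e"},
        ]},
    ]
    color = SEVERITY_COLORS.get(severity, FALLBACK_COLOR)
    return {"attachments": [{"color": color, "blocks": blocks}]}


def post_to_slack(webhook_url: str, payload: dict) -> int:
    """POST the payload as JSON to an incoming webhook; return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Separating payload construction from the network call makes the message layout easy to check without a live webhook.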

The agent identity lives in GitLab. A config.yaml file inside .gitlab/agents/green-security-agent/ registers EcoSecure as a proper GitLab Agent with defined permissions and namespace access. This is what makes it a first-class GitLab Agent rather than a standalone script.

The trigger is GitLab CI/CD. The .gitlab-ci.yml pipeline runs on every push, uses git diff to extract only changed Python, JavaScript, and TypeScript files, writes the diff to a temp file, and uses Python's requests library to POST it to the backend — avoiding shell escaping issues that plagued earlier curl-based approaches.
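The pipeline step described above might look roughly like the following script. The function names, the "diff" JSON key, and the backend URL parameter are hypothetical; the commit range is assumed to come from GitLab's predefined CI_COMMIT_BEFORE_SHA and CI_COMMIT_SHA variables. The requests library is imported inside the sending function only so that the rest of the sketch runs without it installed.

```python
import json
import subprocess
import tempfile

# Pathspecs limiting the diff to Python, JavaScript, and TypeScript files.
SCANNED_PATTERNS = ["*.py", "*.js", "*.ts"]


def diff_command(before_sha: str, after_sha: str) -> list:
    """Build the git command as an argument list, so no shell quoting is involved."""
    return ["git", "diff", before_sha, after_sha, "--", *SCANNED_PATTERNS]


def write_diff_to_temp_file(before_sha: str, after_sha: str) -> str:
    """Run git diff and store its output in a temp file; return the file path."""
    result = subprocess.run(diff_command(before_sha, after_sha),
                            capture_output=True, text=True, check=True)
    with tempfile.NamedTemporaryFile("w", suffix=".diff", delete=False,
                                     encoding="utf-8") as handle:
        handle.write(result.stdout)
        return handle.name


def build_request_body(diff_path: str) -> str:
    """Serialize the diff as JSON; json.dumps escapes quotes and newlines."""
    with open(diff_path, encoding="utf-8") as handle:
        return json.dumps({"diff": handle.read()})


def send_diff(backend_url: str, diff_path: str) -> int:
    """POST the serialized diff to the backend and return the HTTP status code."""
    import requests  # third-party dependency, needed only for the network call

    response = requests.post(backend_url, data=build_request_body(diff_path),
                             headers={"Content-Type": "application/json"},
                             timeout=30)
    response.raise_for_status()
    return response.status_code
```

Because the diff only ever travels as a file and a Python string, the shell never sees its contents.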

The backend is a Flask application deployed on Render. It receives the diff, constructs a structured security analysis prompt, calls the Groq API, parses the JSON response, calculates the CO2e metric from token usage, builds the Slack Block Kit payload, and posts the alert. The entire analysis logic is approximately 100 lines of Python.

The AI layer is Groq running Llama 3.1 8B Instant. We chose Groq specifically for its sub-second inference speeds (critical for a tool that runs on every push) and its generous free tier. We prompt the model to return structured JSON with four fields: has_issue, severity, issue_summary, and fix_suggestion, then handle markdown fence stripping for cases where the model wraps its response.
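The fence-stripping and parsing step can be sketched as follows. This is one possible approach under stated assumptions, not the project's code: the helper names are hypothetical, and the only details taken from the description above are the four field names and the fact that the model sometimes wraps its JSON in a markdown fence.

```python
import json

REQUIRED_FIELDS = ("has_issue", "severity", "issue_summary", "fix_suggestion")
FENCE = "`" * 3  # the markdown code-fence marker


def strip_markdown_fence(text: str) -> str:
    """Return the text inside a surrounding markdown fence, if there is one."""
    stripped = text.strip()
    wrapped = (len(stripped) >= 2 * len(FENCE)
               and stripped.startswith(FENCE) and stripped.endswith(FENCE))
    if not wrapped:
        return stripped
    inner = stripped[len(FENCE):-len(FENCE)]
    # Drop an optional language tag such as "json" on the opening fence line.
    first_line, newline, rest = inner.partition("\n")
    if newline and (first_line.strip() == "" or first_line.strip().isalnum()):
        inner = rest
    return inner.strip()


def parse_analysis(raw: str) -> dict:
    """Parse the model output and check that all four expected fields exist."""
    data = json.loads(strip_markdown_fence(raw))
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"model response is missing fields: {missing}")
    return data
```

Validating the fields right after parsing means a malformed model response fails loudly at this step instead of producing a half-empty Slack alert later.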

Challenges We Ran Into

The first major challenge was the Groq SDK version conflict. The pinned version in our initial requirements file conflicted with the installed httpx version, producing a cryptic proxies argument error. The fix was simple (remove the version pin and let pip resolve the latest compatible version), but diagnosing it cost significant time.

The second challenge was shell escaping in GitLab CI/CD. Our initial approach used curl with an inline shell-constructed JSON payload. Code diffs contain quotes, apostrophes, newlines, and special characters that completely break shell string interpolation. After several failed approaches, we solved it by writing the diff to a temp file and using Python's requests library with native JSON serialization, which handles all escaping automatically.

The third challenge was CI/CD variable injection. Our GitLab variables were marked as Protected, which means they are only injected into pipelines running on protected branches. Since we had no protected branches configured, the variables were silently empty, producing an Illegal header value b'Bearer ' error from Groq because the API key was an empty string. Unmarking the variables as Protected resolved it immediately.
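A failure like this can be surfaced earlier with a fail-fast check on the variable before any request is built. The sketch below is an illustrative guard, not part of the original project; the helper names are hypothetical, and GROQ_API_KEY is assumed to be the variable name in use.

```python
import os


def require_env(name: str, environ=None) -> str:
    """Return a non-empty environment variable or raise a descriptive error."""
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is empty or unset. If it is a Protected CI/CD variable, "
            "check that the pipeline is running on a protected branch."
        )
    return value


def auth_header(api_key: str) -> dict:
    """Build the Authorization header from a key already known to be non-empty."""
    return {"Authorization": f"Bearer {api_key}"}


if __name__ == "__main__":
    # Assumed variable name; raises immediately if the key was not injected.
    print(sorted(auth_header(require_env("GROQ_API_KEY"))))
```

With a check like this, an empty key produces a message that points at the CI/CD configuration instead of a low-level header error.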

The fourth challenge was finding a free hosting alternative to GCP Cloud Functions. Without GCP credits, we needed a platform that could host a persistent Python web server, auto-deploy from GitLab, and handle reasonable request volumes on a free tier. Render solved all three requirements, though it required switching from Google's functions-framework to Flask with Gunicorn, and adding a Procfile and .python-version file for correct runtime configuration.

The fifth challenge was the model being too strict for our "clean code" test cases. Even pure math utility functions got flagged — integer overflow in a multiply function, missing input validation in string utilities, improper salt storage in SHA-256 hashing. We reframed this as a feature: a security agent that catches edge cases developers wouldn't think to check is more valuable than one that only catches obvious issues.

Accomplishments That We're Proud Of

We are proud that the entire stack runs on free tier infrastructure. GitLab, Render, Groq, and Slack — zero infrastructure cost. Any engineering team in the world can fork this repository, add three environment variables, and have a production-grade AI security agent running in under fifteen minutes.

We are proud of how fast it works. From git push to Slack alert in under 30 seconds. For a tool that runs on every single commit, speed isn't a nice-to-have; it's the difference between developers noticing alerts and ignoring them.

We are proud of the Green Security metric. No other security tool we found displays the carbon cost of its AI analysis. Making AI energy consumption visible is a small step, but it's a meaningful one. It starts a conversation that the industry needs to have.

We are proud that the agent is genuinely intelligent, not pattern-matching. It doesn't just grep for the string password =. It understands context: it caught a ReDoS vulnerability in a regex pattern, identified insecure deserialization via pickle, and flagged catastrophic backtracking in an email validator. That level of analysis is only possible with a large language model.

We are proud that we built this end-to-end in under two hours, debugging real infrastructure problems in real time, and shipped something that actually works.

What We Learned

We learned that the hardest problems in developer tooling are almost never technical; they're behavioural. The best security tool is the one developers actually use, not the most sophisticated one. Designing for the path of least resistance, putting the alert where the developer already is rather than where the tool wants them to be, is more impactful than any algorithmic improvement.

We learned that free tier infrastructure is genuinely production-capable in 2025. Render, Groq, GitLab CI/CD, and Slack webhooks: none of these cost a dollar, and together they form a stack that handles real workloads reliably. The barrier to shipping AI-powered tools has never been lower.

We learned that shell scripting in CI/CD pipelines is surprisingly fragile. Code diffs are adversarial inputs from a shell escaping perspective: they contain every character that breaks string interpolation. Moving payload construction into Python eliminated an entire class of bugs.

We learned that LLMs make genuinely useful security reviewers. Llama 3.1 caught vulnerabilities we hadn't explicitly prompted for: it identified an insecure deserialization pattern, a path traversal vulnerability in a file read function, and a weak random number generator used for password reset tokens. The model's training on security literature makes it a surprisingly capable analyst, even at a small 8B parameter size.

We learned to embrace unexpected results. Every time the model flagged something in our "clean" test code, our instinct was to fix the prompt. But the model was usually right. Integer overflow is a real vulnerability. Missing input validation is a real vulnerability. A strict model is a better security tool than a lenient one.

What's Next for EcoSecure Agent

The immediate next step is multi-repository support. Right now the agent watches a single repository. The natural evolution is a central EcoSecure workspace in Slack that aggregates alerts from every repository across an organization, with routing rules that send alerts to the right channel based on which repo triggered them.

The second roadmap item is a carbon dashboard. Right now CO2e appears in individual alerts. The vision is a weekly Slack digest: total scans run, total vulnerabilities caught, total carbon consumed by AI analysis, and the trend over time. Engineering managers get a sustainability report without any manual work.

The third item is severity thresholds and policies. Teams should be able to define rules: block a merge request if a HIGH severity issue is found, require human approval for MEDIUM issues, auto-dismiss LOW issues. This turns EcoSecure from a passive notifier into an active gatekeeper.

The fourth item is expanding language support. Currently we scan Python, JavaScript, and TypeScript. Adding Go, Rust, Java, and Ruby would cover the vast majority of production codebases.

The fifth item is fine-tuning. Running a dataset of real-world vulnerabilities and their fixes through a fine-tuning pipeline on a larger model would reduce false positives and improve the specificity of fix suggestions, moving from "use environment variables" to "here is the exact code change to make."

The long-term vision is EcoSecure as the security layer of every GitLab-powered engineering team: invisible, automatic, and already in the tool you're using right now. Secure code, sustainable AI, zero context switching.

Built With

  • flask
  • gitlab-agent
  • gitlab-ci/cd
  • groq
  • gunicorn
  • llama-3.1-8b-instant
  • python
  • render
  • slack-block-kit
  • slack-incoming-webhooks-api