Inspiration

As AI-generated code becomes more common, the risk of insecure or poorly reviewed changes grows alongside it. Rapidly built applications are increasingly exposed to security vulnerabilities, misconfigurations, and preventable deployment mistakes. This makes CI/CD automation and automated review systems more critical than ever. Generated code must do more than work: it must also be validated against security, reliability, and deployment standards before it reaches production. Argus was inspired by the need for a practical safeguard in that workflow.

What it does

Argus is a security and deployment risk scanner focused on code changes. It reviews commits, merge request diffs, and related files to surface likely issues such as hardcoded secrets, unsafe configuration changes, dependency or deployment risks, and other problems that could compromise security or release stability. Its role is to detect issues and describe them as structured findings that downstream agents or human reviewers can act on.

What Argus does not do is fix issues, create patches, or make final decisions. At most, it can leave a single concise merge request note summarizing grounded findings, without exposing sensitive values.
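The note constraint can be sketched as a small rendering step. This is an illustrative assumption rather than Argus's real code: `render_note`, the `SAFE_FIELDS` tuple, and the severity names are hypothetical. The idea is that the note is built only from location and rule metadata, so any matched text or evidence attached to a finding never reaches the merge request.

```python
from typing import Optional

# Assumed severity levels, ordered from most to least urgent.
SEVERITY_RANK = {"high": 0, "medium": 1, "low": 2}

# Only these fields are ever copied into the note. Anything else on a finding
# (for example a hypothetical "evidence" snippet) is deliberately ignored.
SAFE_FIELDS = ("severity", "file", "line", "rule")


def render_note(findings: list[dict], max_items: int = 5) -> Optional[str]:
    """Build one concise note, or return None when there is nothing to report."""
    if not findings:
        return None
    ordered = sorted(
        findings,
        key=lambda f: SEVERITY_RANK.get(f["severity"], len(SEVERITY_RANK)),
    )
    lines = [f"Argus flagged {len(ordered)} potential issue(s) for review:"]
    for finding in ordered[:max_items]:
        severity, file, line, rule = (finding[key] for key in SAFE_FIELDS)
        lines.append(f"- [{severity}] {file}:{line} {rule}")
    hidden = len(ordered) - max_items
    if hidden > 0:
        lines.append(f"- ...and {hidden} more")
    return "\n".join(lines)
```

Capping the list and sorting by severity keeps the note short, and the wording stays at "flagged for review" because the decision remains with the human reviewer.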

How we built it

Argus was designed around a multi-agent workflow to separate responsibilities and keep the system modular. One agent focuses on reviewing code changes against security and deployment guidelines, handling all analysis and reasoning. A second agent is responsible for turning validated findings into a merge request issue or summary that can be surfaced to the development team.

This separation made the workflow easier to reason about and better aligned with secure code review practices. Instead of having one agent handle everything, we built specialized steps that made the system more structured, auditable, and easier to improve over time.
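The handoff between the two agents can be sketched as a validation step on a structured payload. This is a sketch under stated assumptions, not the workflow we actually shipped: the JSON shape, `validate_handoff`, `report_stage`, and `run_pipeline` are hypothetical names, and the reviewing agent is represented by any callable that returns a JSON string.

```python
import json
from typing import Callable

# Assumed contract for a finding passed from the reviewer to the reporter.
REQUIRED_FIELDS = {"rule": str, "file": str, "line": int, "severity": str}
ALLOWED_SEVERITIES = {"low", "medium", "high"}


def validate_handoff(payload: str) -> list[dict]:
    """Parse the reviewer's output and keep only well-formed findings."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, dict) or not isinstance(data.get("findings"), list):
        return []
    valid = []
    for item in data["findings"]:
        if not isinstance(item, dict):
            continue
        well_typed = all(
            isinstance(item.get(key), expected)
            for key, expected in REQUIRED_FIELDS.items()
        )
        if well_typed and item["severity"] in ALLOWED_SEVERITIES:
            # Copy only the contract fields so extras never reach the reporter.
            valid.append({key: item[key] for key in REQUIRED_FIELDS})
    return valid


def report_stage(findings: list[dict]) -> str:
    """Stand-in for the reporting agent: one summary line per finding."""
    return "\n".join(
        f"[{f['severity']}] {f['file']}:{f['line']} {f['rule']}" for f in findings
    )


def run_pipeline(diff_text: str, reviewer: Callable[[str], str]) -> str:
    """Reviewer output must pass validation before the reporter sees it."""
    return report_stage(validate_handoff(reviewer(diff_text)))
```

Putting the check between the two stages means a malformed or unexpected reviewer response produces an empty report instead of a misleading one, which is one way to keep the flow auditable.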

Challenges we ran into

One of the biggest challenges was getting the overall workflow to compile correctly and trigger when expected. We ran into several issues while configuring the pipeline and understanding how the workflow platform behaved in practice. At the start, the setup was difficult to debug because failed sessions did not always provide enough detail to clearly explain what went wrong.

Another challenge was diagnosing inconsistent failures. In some cases, the workflow would stop without an obvious explanation, which slowed iteration. Over time, we became more comfortable with the tooling and learned how to structure the flow more effectively, but we still saw room for better visibility into failure reasons. We eventually suspected some failures were related to insufficient credits or platform-level limits.

Accomplishments that we're proud of

We are proud to have built a working multi-agent security review flow that addresses a real and growing problem. Argus is not a generic code reviewer. It is focused specifically on security and deployment risk, which gives it a clear and purposeful role in the software delivery pipeline.

We are also proud of successfully separating the reasoning and reporting responsibilities into distinct agents. That design choice made the project feel more intentional and closer to how real review workflows operate. Most importantly, we created something that can help developers catch risky changes earlier, before they become larger production issues.

What we learned

Through building Argus, we learned that multi-agent workflows need clear boundaries, structured handoffs, and narrowly scoped responsibilities to work well. We also gained hands-on experience with CI/CD pipeline triggers and the practical challenges of integrating AI agents into software delivery systems.

Beyond the technical lessons, we learned that security tooling must be both reliable and explainable. It is not enough for an agent to flag a problem. It needs to do so in a way that is grounded, concise, and useful for the next step in the workflow. That made us think more carefully about trust, false positives, and how AI should support developers rather than overwhelm them.

What's next for Argus

The next step for Argus is to make its scanning capabilities broader and more precise. We want to expand the kinds of risks it can detect, improve the quality of its findings, and reduce noise so that developers receive more actionable feedback. We also want to improve observability within the workflow itself so failures are easier to understand and debug.

In the future, Argus could evolve into a more complete security gate within CI/CD. It could not only flag risky changes but also prioritize findings, integrate more smoothly with team workflows, and help organizations adopt AI-assisted development more safely.

Built With

  • gitduo
  • yaml