Inspiration

A while ago, a student accidentally exposed an API key in a GitHub repository and ended up facing a bill of nearly $50,000. The mistake wasn’t malicious or careless — it was simply missed during review.

In another team I worked with, a tech lead had to review the same code and pull requests multiple times, manually checking logs, tests, and files before every release. Even then, things slipped through. The problem wasn't a lack of tools; it was that there was no clear signal saying "this is ready" or "this is not."

That led to a simple question:

What if something could give a clear heads-up before deployment — and explain why?

That question became Gatekeeper.

What Gatekeeper Does

Gatekeeper is a production readiness agent that helps teams decide whether code is safe to ship.

Instead of relying on engineers to manually dig through files, logs, and test output, Gatekeeper:

Runs code safely in isolated Daytona sandboxes

Captures failures and execution logs using Sentry

Produces a clear verdict: safe to ship or not safe to ship

Shows which files have issues and why they block deployment

The goal is not to replace existing tools, but to connect them into a single, actionable decision.
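As a rough illustration of the "verdict plus file-level findings" output described above, here is a minimal sketch. The format, function name, and finding fields are hypothetical, not Gatekeeper's actual report schema:

```python
# Hypothetical rendering of a Gatekeeper-style verdict report.
# The field names and layout are illustrative assumptions.
def render_report(findings):
    """Turn (file, reason) findings into a ship / don't-ship report."""
    lines = []
    if findings:
        lines.append("VERDICT: not safe to ship")
        for path, reason in findings:
            lines.append(f"  {path}: {reason}")
    else:
        lines.append("VERDICT: safe to ship")
    return "\n".join(lines)

print(render_report([
    ("config/settings.py", "API key committed in plain text"),
    ("tests/test_api.py", "2 tests failing in sandbox run"),
]))
```

The point of the single top-line verdict is that a reviewer can act on it without reading the rest, while the per-file reasons explain why it blocks deployment.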

How We Built It

Gatekeeper is built as a lightweight backend agent with a clear decision loop:

A user runs Gatekeeper on a local Python repository

The agent creates a tracked run and starts observability with Sentry

Code is executed inside a secure Daytona sandbox

Tests are run and logs are collected

Failures and errors are captured and traced in Sentry

The agent evaluates results and risk signals

A final report is generated with a verdict and file-level findings

Daytona is used for safe execution of untrusted code, while Sentry provides traceability and auditability of every agent run. The output is a simple report — not a dashboard — focused on what needs attention before deployment.
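The decision loop above can be sketched in miniature. The real agent calls the Daytona and Sentry SDKs; in this sketch those calls are stubbed so the control flow is visible, and names like `run_in_sandbox` and `Finding` are illustrative assumptions rather than Gatekeeper's actual code:

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    file: str
    reason: str

@dataclass
class Report:
    verdict: str
    findings: list = field(default_factory=list)

def run_in_sandbox(repo_path):
    # Stub: in Gatekeeper this step would execute the repo's tests inside
    # a Daytona sandbox and stream failures and logs to Sentry.
    return [Finding("app/payments.py", "test_refund fails: AssertionError")]

def evaluate(repo_path):
    # Evaluate results and risk signals, then emit the final verdict.
    findings = run_in_sandbox(repo_path)
    verdict = "not safe to ship" if findings else "safe to ship"
    return Report(verdict=verdict, findings=findings)

report = evaluate("./my-python-repo")
print(report.verdict)  # not safe to ship
for f in report.findings:
    print(f"{f.file}: {f.reason}")
```

The design choice worth noting is that sandbox execution and verdict evaluation are separate steps: the sandbox only produces evidence, and the verdict is computed from that evidence in one place.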

Challenges We Faced

Scoping: It was tempting to analyze any repository or language, but for reliability we intentionally limited the MVP to local Python repos.

Error handling: Ensuring the agent continues gracefully even when one step fails required careful design.

Tool boundaries: Each sponsor tool is powerful on its own; the challenge was using them together without overlapping responsibilities.

Honest MVP design: Avoiding overclaiming while still delivering something useful and believable.
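The "continue gracefully when one step fails" pattern mentioned above can be sketched like this. The step names are hypothetical; the idea is that a failed step is recorded rather than allowed to abort the run:

```python
# Hypothetical sketch: run each pipeline step, record failures instead of
# aborting, so one broken step still yields a (degraded) report.
def run_pipeline(steps):
    results = {}
    for name, step in steps:
        try:
            results[name] = ("ok", step())
        except Exception as exc:
            # In Gatekeeper, this failure would also be captured in Sentry.
            results[name] = ("failed", str(exc))
    return results

def failing_step():
    raise RuntimeError("tests crashed")

results = run_pipeline([
    ("create_sandbox", lambda: "sandbox-123"),
    ("run_tests", failing_step),
    ("collect_logs", lambda: ["log line"]),
])
print(results["run_tests"])  # ('failed', 'tests crashed')
```

Because every step produces an entry either way, the final report can always say what ran, what failed, and why.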

What We Learned

Many production issues are not caused by missing tools, but by missing decisions

Safe execution and observability together are far more powerful than either alone

A clear “heads-up” can save teams time, stress, and real money

An agent doesn’t need to be complex — it needs to be trustworthy and explainable

Why It Matters

Gatekeeper helps teams catch problems before they reach production. Instead of guessing or re-reviewing the same code repeatedly, teams get a clear signal and a reason.

Gatekeeper exists to make sure critical issues aren’t missed when it matters most.
