About the project
AgentGate was inspired by a gap we noticed while working with Codex during the hackathon: AI agents are becoming capable enough to operate real infrastructure, but the access layer around them is still built for humans.
Today, password managers and secret managers usually assume an interactive human login flow. They are great for people, but they are not natively designed for autonomous coding agents that need to operate infrastructure through controlled, auditable workflows.
We also could not find a Codex skill on the market that solves this specific problem: letting Codex operate SSH servers without ever receiving the SSH key, password, or root credentials.
So we built AgentGate, a PAM layer for AI agents.
Instead of giving Codex direct credentials, Codex sends an operation request to AgentGate. AgentGate authenticates the agent through an API key, checks the target policy, connects to the server on behalf of the agent, executes the approved command, and records a full audit trail.
During the live demo, Codex can install, remove, and inspect software on real remote servers without ever seeing the SSH key or password.
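The request flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the AgentGate implementation: the names `Broker`, `ExecResult`, `handle`, and `fake_ssh` are hypothetical, the policy is a placeholder, and the SSH step is replaced by an injected function so the example runs without a server.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ExecResult:
    stdout: str
    stderr: str
    exit_code: int


@dataclass
class Broker:
    """Hypothetical broker: authenticate, check policy, execute, audit."""
    api_keys: Dict[str, str]                          # API key -> actor name
    is_allowed: Callable[[str, str], bool]            # (target, command) -> allowed?
    run_on_target: Callable[[str, str], ExecResult]   # only this side holds SSH secrets
    audit_log: List[dict] = field(default_factory=list)

    def handle(self, api_key: str, target: str, command: str) -> dict:
        actor = self.api_keys.get(api_key)
        if actor is None:
            return self._record("unknown", target, command, "unauthenticated", None)
        if not self.is_allowed(target, command):
            return self._record(actor, target, command, "denied", None)
        result = self.run_on_target(target, command)
        return self._record(actor, target, command, "allowed", result)

    def _record(self, actor: str, target: str, command: str,
                decision: str, result: Optional[ExecResult]) -> dict:
        # Every request is logged, including rejected ones.
        entry = {
            "audit_id": len(self.audit_log) + 1,
            "actor": actor,
            "target": target,
            "command": command,
            "decision": decision,
            "exit_code": result.exit_code if result else None,
        }
        self.audit_log.append(entry)
        return entry


def fake_ssh(target: str, command: str) -> ExecResult:
    # Stand-in for the real SSH call, which would use credentials the agent never sees.
    return ExecResult(stdout="ok", stderr="", exit_code=0)


broker = Broker(
    api_keys={"ag_demo_key": "codex"},
    is_allowed=lambda target, command: command.split()[:1] == ["uptime"],
    run_on_target=fake_ssh,
)
allowed = broker.handle("ag_demo_key", "demo-droplet", "uptime")
denied = broker.handle("ag_demo_key", "demo-droplet", "rm -rf /")
```

The agent only ever passes an API key, a target name, and a command. The function that touches credentials lives entirely on the broker side.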
What makes it different
Most credential tools today are designed around this flow:
$$ \text{Human} \rightarrow \text{Login} \rightarrow \text{Retrieve Secret} \rightarrow \text{Use Secret} $$
AgentGate uses a different model:
$$ \text{Agent} \rightarrow \text{Authorized Request} \rightarrow \text{Policy Check} \rightarrow \text{Brokered Execution} \rightarrow \text{Audit Log} $$
The agent does not retrieve the secret.
The agent does not copy the secret.
The agent does not store the secret.
AgentGate owns the connection, applies controls, executes the allowed operation, and records what happened.
That distinction matters because AI agents should not become another place where production credentials leak.
Live and working demo
AgentGate is not only a concept or a mockup. The solution is fully working and publicly testable.
We deployed AgentGate over HTTPS, connected it to real SSH servers, created a Codex skill, and verified end-to-end execution through the PAM layer.
For fast judging and testing, we provide a temporary magic login link:
https://agentgate.fucito.it/magic/tvcyjOZuvGYoBiT5mqQ57JhItloyie_2x8SWCckDWuw
The magic link is only for demo speed. Without the magic link, AgentGate uses a standard username and password login. In a production version, this would be extended with MFA, SSO, RBAC, and enterprise identity providers.
In the live demo, Codex can:
- List onboarded SSH targets
- Connect to a real remote server through AgentGate
- Install software such as Docker or Python
- Remove software
- Check disk, memory, uptime, and service status
- Execute commands without ever seeing SSH credentials
- Produce audit logs for every action
Every operation is executed through AgentGate and recorded with:
- The API key / actor that requested it
- The target server
- The human-readable operation summary
- The raw command
- The decision
- stdout and stderr
- exit code
- audit ID
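As an illustration, the fields listed above could be modeled as one immutable record. The class name `AuditRecord`, the field names, and the sample values are assumptions made for this sketch; the actual AgentGate schema may differ.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical shape of one audit entry, mirroring the list above."""
    actor: str        # API key name / actor that requested the operation
    target: str       # target server
    summary: str      # human-readable operation summary
    command: str      # raw command
    decision: str     # policy decision
    stdout: str
    stderr: str
    exit_code: int
    audit_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


record = AuditRecord(
    actor="codex-demo-key",
    target="demo-droplet",
    summary="Check disk usage",
    command="df -h",
    decision="allowed",
    stdout="",
    stderr="",
    exit_code=0,
)
```

Marking the record as frozen means an entry cannot be modified in memory once it has been created, which fits the idea of an audit trail.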
The project repository is available at:
https://github.com/marcofucito/AgentGate
What we built
AgentGate includes:
- A web dashboard to onboard SSH targets once
- Encrypted credential storage
- API keys for agent access
- Policy controls for command execution
- A secure command execution API for AI agents
- Audit logs with actor, target server, command, decision, stdout, stderr, exit code, and audit ID
- Human-readable audit summaries such as "Install Docker", "Uninstall Docker", "Install Python 3", "Check disk usage", and "Check memory usage"
- A per-server human review view showing which agent performed which operation
- A Codex skill integration that lets Codex use AgentGate naturally from chat
- A public HTTPS deployment with a downloadable Codex skill
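For the API keys in the list above, one common pattern is to store only a digest of each key and compare digests in constant time. The write-up does not say how AgentGate stores its keys, so the sketch below is an assumption, and the helper names and the `ag_` prefix are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def issue_api_key() -> Tuple[str, str]:
    """Return (plaintext key to show once, SHA-256 digest to store)."""
    key = "ag_" + secrets.token_urlsafe(32)  # "ag_" prefix is illustrative only
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return key, digest


def verify_api_key(presented: str, stored_digest: str) -> bool:
    """Check a presented key against the stored digest in constant time."""
    digest = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(digest, stored_digest)


key, stored = issue_api_key()
```

With this approach a leaked database row does not reveal a usable key, and revoking an agent amounts to deleting its digest.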
How we built it
We built the MVP with:
- FastAPI for the backend
- SQLite + SQLAlchemy for persistence
- Paramiko for SSH execution
- Fernet encryption for stored secrets
- Jinja templates for the dashboard
- Docker Compose for deployment
- Nginx + Let's Encrypt for HTTPS
- DigitalOcean Droplets as real SSH targets
- Codex Skills to expose AgentGate as an agent-native capability
The core flow is:
$$ \text{Codex} \rightarrow \text{AgentGate} \rightarrow \text{Policy Check} \rightarrow \text{SSH Target} \rightarrow \text{Human-Readable Audit} $$
The key design principle is:
The agent can act, but it never owns the keys.
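From the agent's side, the first hop of this flow is a single HTTP request that carries an API key and no SSH secret. The sketch below only builds that request. The `/api/execute` path, the JSON field names, the bearer-token header, and the `agentgate.example` host are all assumptions for illustration, not the documented AgentGate API.

```python
import json
import urllib.request


def build_execute_request(base_url: str, api_key: str,
                          target: str, command: str) -> urllib.request.Request:
    """Build (but do not send) a hypothetical AgentGate execution request."""
    payload = json.dumps({"target": target, "command": command}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/execute",  # hypothetical endpoint path
        data=payload,
        headers={
            "Authorization": "Bearer " + api_key,   # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_execute_request(
    "https://agentgate.example", "ag_demo_key", "demo-droplet", "uptime"
)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```

Nothing in the request body identifies or contains the SSH credential. The target is referenced by name, and the broker resolves it to a stored secret.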
Security and compliance vision
AgentGate is designed to make LLM-based infrastructure operations safer and more compliant.
The current Codex skill authenticates through an AgentGate API key. In the future, this can be extended with device-bound certificates installed on approved developer machines, so only trusted Codex environments can call the PAM broker.
For production, AgentGate can evolve toward:
- MFA and SSO for human dashboard access
- RBAC for teams, environments, and server groups
- Device-bound certificates for Codex skill authentication
- Approval workflows for risky commands
- Stronger policy engines for command classification
- Integration with enterprise secret managers
- SIEM and compliance exports
- Session recording and tamper-resistant audit logs
This makes the product relevant for companies with strict security requirements, including organizations working toward SOC controls, NIS2 readiness, privileged access governance, access traceability, and auditability.
The goal is not only to make AI agents powerful. The goal is to make them safe enough for companies to actually use.
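One roadmap item above, tamper-resistant audit logs, is often approached with a hash chain in which each entry commits to the one before it. The sketch below shows that general technique under simple assumptions. It is not part of the current MVP, and the function names and entry layout are hypothetical.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # placeholder hash for the first entry's predecessor


def _digest(prev_hash: str, event: dict) -> str:
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: List[dict], event: dict) -> dict:
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)}
    chain.append(entry)
    return entry


def verify_chain(chain: List[dict]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


chain: List[dict] = []
append_entry(chain, {"actor": "codex", "command": "df -h", "decision": "allowed"})
append_entry(chain, {"actor": "codex", "command": "free -m", "decision": "allowed"})
```

A chain like this detects modification but does not prevent it. In practice the latest hash would also need to be anchored somewhere the broker cannot rewrite, such as an external log store.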
Business model
AgentGate can be sold in two ways:
- Multi-tenant SaaS for startups and modern engineering teams that want to connect AI agents to infrastructure quickly, with managed hosting, usage-based pricing, team policies, audit logs, and integrations.
- Self-hosted or on-prem deployment for enterprises, banks, defense, healthcare, critical infrastructure, and security-first organizations that cannot send credentials, audit logs, or infrastructure access through an external tenant.
This dual model matters because the most security-sensitive companies are also the ones that could benefit the most from AI agents, but they need stronger guarantees before trusting them.
With AgentGate, those companies can adopt LLM-based operators without giving the model direct access to production secrets or unrestricted systems.
What we learned
We learned that the hard part is not making an AI agent execute commands. The hard part is making that execution acceptable for real teams: secure, auditable, revocable, and understandable.
A raw command log is useful for engineers, but not enough for human review. That is why we added operation summaries and per-server audit views. A reviewer should not need to parse shell commands to understand that an agent installed Docker, removed Docker, checked disk usage, or verified Python.
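A very small version of this idea is a rule table that maps raw commands to reviewer-friendly labels. The patterns below are assumptions for illustration (they presume apt-based targets) and are not the rules AgentGate actually uses; the summary strings are the ones mentioned in this write-up.

```python
import re

# Hypothetical rules: (pattern over the raw command, human-readable summary).
SUMMARY_RULES = [
    (re.compile(r"\bapt(-get)?\s+install\b.*\bdocker"), "Install Docker"),
    (re.compile(r"\bapt(-get)?\s+(remove|purge)\b.*\bdocker"), "Uninstall Docker"),
    (re.compile(r"\bapt(-get)?\s+install\b.*\bpython3\b"), "Install Python 3"),
    (re.compile(r"^\s*df\b"), "Check disk usage"),
    (re.compile(r"^\s*free\b"), "Check memory usage"),
]


def summarize(command: str) -> str:
    """Return the first matching summary, or a generic fallback."""
    for pattern, summary in SUMMARY_RULES:
        if pattern.search(command):
            return summary
    return "Run command: " + command
```

For example, `summarize("sudo apt-get install -y docker.io")` returns `"Install Docker"`, while an unrecognized command falls back to showing the raw text so nothing is hidden from the reviewer.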
We also learned that agent integrations need to feel native. Asking Codex to go through a human-first password manager flow breaks the agentic workflow. The Codex skill was important because it made AgentGate usable directly from the AI workflow instead of being just another dashboard.
Challenges
The biggest challenges were:
- Designing a PAM flow that was meaningful but still buildable during a hackathon
- Keeping credentials hidden from the AI agent while still allowing useful infrastructure actions
- Handling real SSH targets securely
- Making audit logs understandable for humans, not only machines
- Building and deploying a public HTTPS demo quickly
- Making the Codex skill installable and reliable from both the domain and the IP fallback
- Working around the fact that existing credential tools are mostly human-login-first, not agent-native
- Balancing speed, security, and demo clarity under time pressure
Why it matters
AI agents will increasingly operate production systems. Without a control layer, that creates a major security and compliance gap.
We have already seen cases where LLM-driven tools or autonomous agents made destructive mistakes, such as deleting data, running unsafe commands, or acting on hallucinated assumptions. The risk is not only malicious behavior. It is also overconfident automation.
AgentGate reduces that risk by putting a policy and audit layer between the model and the infrastructure.
An LLM can hallucinate a command.
AgentGate can deny it.
An agent can request a risky operation.
AgentGate can log it, block it, or require review.
A company can use AI automation.
Security teams can still keep control.
AgentGate separates intent from credentials.
Codex can request an operation. AgentGate controls execution. Humans keep visibility.
That is the bridge companies need before they can trust AI agents with real infrastructure.
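The "deny, log, or require review" behavior described above can be sketched as a three-way policy decision. The allowlist, the review rules, and all names here are hypothetical and deliberately simplistic; a real policy engine would need far more careful command classification.

```python
import shlex
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REVIEW = "review"
    DENY = "deny"


# Hypothetical policy data, for illustration only.
ALLOWED_PROGRAMS = {"df", "free", "uptime", "systemctl", "apt-get"}
REVIEW_SUBCOMMANDS = {("apt-get", "remove"), ("apt-get", "purge"), ("systemctl", "stop")}
SHELL_METACHARACTERS = set(";|&$`><\n")


def decide(command: str) -> Decision:
    """Classify a requested command as allow, review, or deny."""
    # Reject anything that could chain or redirect commands in a shell.
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return Decision.DENY
    try:
        argv = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return Decision.DENY
    if not argv or argv[0] not in ALLOWED_PROGRAMS:
        return Decision.DENY
    # Destructive subcommands of allowed programs go to a human reviewer.
    if any((argv[0], arg) in REVIEW_SUBCOMMANDS for arg in argv[1:]):
        return Decision.REVIEW
    return Decision.ALLOW
```

Under these rules `decide("df -h")` is allowed, `decide("apt-get -y remove docker.io")` is sent for review, and `decide("df -h; rm -rf /")` is denied. A default-deny allowlist means a hallucinated command fails closed rather than open.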
Built With
- bootstrap
- codex
- compose
- digitalocean
- docker
- droplets
- encryption
- fastapi
- fernet
- jinja
- linux
- paramiko
- python
- rest
- skills
- sqlalchemy
- sqlite
- ssh
- ubuntu