- AirLock creates a visible checkpoint between sandboxed AI agents and real external systems.
- The control panel shows system readiness, pending crossings, receipts, and live observability in one place.
- Operators can inspect verified source evidence, the exact outbound payload, and the event timeline before approving an action.
- Every approved action becomes a durable receipt with provenance, payload integrity, and execution metadata.
- AirLock blocks unsafe or misrouted actions before execution and explains the policy decision in plain language.
- GitHub is used for source verification and Slack for outbound delivery, with privileged actions mediated by AirLock.
Inspiration
We built AirLock because AI agents are getting good at proposing actions, but most products still treat the jump from "suggestion" to "real-world execution" as a black box. We wanted to create a visible checkpoint between a sandboxed agent and external systems like Slack, where every action has to cross an intentional border first. The core idea was simple:
$$ \text{Safe agent action} = \text{verified source} \times \text{policy checks} \times \text{human approval} $$
Instead of asking users to blindly trust an agent, AirLock makes trust inspectable.
What it does
AirLock is a human-in-the-loop control layer for AI agents. A local companion can propose an outbound action, like escalating a GitHub issue to Slack, but that action cannot leave the sandbox until AirLock verifies the source, evaluates policy, and presents the exact payload for operator review.
The product shows pending crossings, blocked attempts, sent receipts, event timelines, and companion heartbeat status in one dashboard. Operators can approve or deny each crossing, and once approved, AirLock records a receipt with provenance, timestamps, and execution details.
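The crossing lifecycle described above can be sketched as a small state machine. This is only an illustration under assumptions: the status names, `initialCrossingStatus`, and `canTransition` are hypothetical and are not taken from the AirLock codebase.

```typescript
// Hypothetical status names: the write-up mentions pending crossings, blocked
// attempts, approve/deny decisions, and sent receipts, but not the real schema.
type CrossingStatus = "pending" | "blocked" | "approved" | "denied" | "sent";

// Which statuses each status may move to. Blocked, denied, and sent are terminal.
const allowedTransitions: Record<CrossingStatus, readonly CrossingStatus[]> = {
  pending: ["approved", "denied"],
  approved: ["sent"],
  blocked: [],
  denied: [],
  sent: [],
};

// A proposed action only reaches the operator if the source was verified
// and policy allowed it; otherwise it is recorded as a blocked attempt.
function initialCrossingStatus(
  sourceVerified: boolean,
  policyAllowed: boolean,
): CrossingStatus {
  return sourceVerified && policyAllowed ? "pending" : "blocked";
}

// True only if the move from one status to another is permitted.
function canTransition(from: CrossingStatus, to: CrossingStatus): boolean {
  return allowedTransitions[from].includes(to);
}
```

In this sketch nothing can reach "sent" without first passing through "approved", which is one way to model the rule that an action cannot leave the sandbox without operator approval.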
How we built it
We built AirLock as a Vite + React + TypeScript app with a Supabase backend. On the frontend, we used React Router for the app flow, TanStack Query for data fetching and cache invalidation, Supabase Realtime for live updates, and a shadcn/Radix-based component system for the control panel UI. We also used Framer Motion to make the review experience feel deliberate without adding noise.
On the backend, we used Supabase Edge Functions to power the core workflow: the intents function receives proposed actions, status reports system readiness, heartbeat tracks whether the local companion is online, crossings handles the review, detail, approve, and deny flows, and demo seeds deterministic scenarios. We also added payload hashing, idempotency keys, policy evaluation, and event logging so every crossing has an auditable lifecycle from intake to outcome.
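As a rough illustration of payload hashing and idempotency keys, here is a minimal sketch. It assumes SHA-256 over a key-sorted JSON serialization and uses Node's `node:crypto` module for brevity; the write-up does not say which hash or serialization AirLock uses, and `canonicalize`, `payloadHash`, and `idempotencyKey` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize JSON with object keys sorted, so logically equal payloads
// produce the same string regardless of key order.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(",")}]`;
  }
  const entries = Object.keys(value)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${canonicalize(value[key] as Json)}`);
  return `{${entries.join(",")}}`;
}

// Hex-encoded SHA-256 of the canonical payload (assumed hash choice).
function payloadHash(payload: Json): string {
  return createHash("sha256").update(canonicalize(payload)).digest("hex");
}

// Hypothetical idempotency key: the same source, destination, and payload
// always map to the same key, so a retried intent can be detected.
function idempotencyKey(source: string, destination: string, payload: Json): string {
  const material = [source, destination, payloadHash(payload)].join("\n");
  return createHash("sha256").update(material).digest("hex");
}
```

Sorting keys before hashing matters because two JSON objects with the same fields in a different order would otherwise hash differently.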
Challenges we ran into
The hardest challenge was designing the trust boundary itself. It is easy to build a UI that says "approve" or "deny," but much harder to guarantee that the human is approving the exact payload that will be sent. That forced us to think carefully about payload hashing, state transitions, and preventing stale or mismatched reviews.
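One way to guard against stale or mismatched reviews is to have the approval request carry the hash of the payload the operator actually saw, and reject the approval if it no longer matches. The sketch below assumes that approach; `ReviewedCrossing`, `approveCrossing`, and the reason codes are hypothetical, not AirLock's actual API.

```typescript
// Hypothetical shape of a stored crossing.
interface ReviewedCrossing {
  id: string;
  status: "pending" | "approved" | "denied";
  payloadHash: string; // hash of the payload that would actually be sent
}

type ApprovalResult =
  | { ok: true; crossing: ReviewedCrossing }
  | { ok: false; reason: "not_pending" | "hash_mismatch" };

// Approve only if the crossing is still pending and the operator reviewed
// exactly the payload that is currently stored for sending.
function approveCrossing(
  crossing: ReviewedCrossing,
  reviewedHash: string,
): ApprovalResult {
  if (crossing.status !== "pending") {
    return { ok: false, reason: "not_pending" }; // stale review
  }
  if (crossing.payloadHash !== reviewedHash) {
    return { ok: false, reason: "hash_mismatch" }; // payload changed after review
  }
  return { ok: true, crossing: { ...crossing, status: "approved" } };
}
```

In a real backend this check and the status update would need to happen atomically (for example in a single conditional update), otherwise two concurrent approvals could both pass the check.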
Another challenge was balancing realism with demoability. We wanted AirLock to feel like a real control plane, but we also needed deterministic demo scenarios for hackathon judging, local testing, and repeatable flows. Building mock-safe provider behavior without losing the architecture of a real system took more iteration than expected.
Accomplishments that we're proud of
We are proud that AirLock feels like a real product, not just a concept demo. The end-to-end flow is there: agent intent ingestion, source verification, policy blocking, human review, outbound execution, and receipts with timelines.
We are also proud of the operator experience. The dashboard makes the crossing model understandable at a glance, and the review drawer makes it very clear what came in, why it is being proposed, where it is going, and what happens if you approve it. On top of that, we built a serious test stack with frontend unit tests, edge-function integration tests, E2E coverage, and deterministic demo controls.
What we learned
We learned that AI safety becomes much more concrete when you turn it into product primitives. "Human in the loop" is not enough on its own; you also need provenance, policy, integrity checks, and a clean operator interface. Trust is not a single feature; it is a chain.
We also learned that realtime systems feel dramatically more credible when operators can see state changes, receipts, and liveness as they happen. In other words, observability is not just an ops concern here; it is part of the user experience.
What's next for AirLock
Next, we want to replace demo-mode assumptions with real provider integrations, including live GitHub verification and real Slack execution through connected accounts. We also want to expand the policy engine so teams can define allowed destinations, message rules, escalation paths, and approval policies that match their actual operations.
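To make the idea of team-defined policies concrete, here is a minimal sketch of what a rule set with allowed destinations and a message rule could look like. Everything here is an assumption for illustration: `PolicyRules`, `OutboundAction`, `evaluatePolicy`, and the specific rules are hypothetical and do not describe the current policy engine.

```typescript
// Hypothetical rule set a team might configure.
interface PolicyRules {
  allowedDestinations: readonly string[]; // e.g. Slack channel names
  maxMessageLength: number;
}

// Hypothetical outbound action proposed by an agent.
interface OutboundAction {
  destination: string;
  message: string;
}

interface PolicyDecision {
  allowed: boolean;
  reasons: string[]; // plain-language explanations for the operator
}

// Collect every violated rule so the operator sees the full explanation,
// not only the first failure.
function evaluatePolicy(action: OutboundAction, rules: PolicyRules): PolicyDecision {
  const reasons: string[] = [];
  if (!rules.allowedDestinations.includes(action.destination)) {
    reasons.push(`Destination "${action.destination}" is not on the allowed list.`);
  }
  if (action.message.length > rules.maxMessageLength) {
    reasons.push(
      `Message is ${action.message.length} characters; the limit is ${rules.maxMessageLength}.`,
    );
  }
  return { allowed: reasons.length === 0, reasons };
}
```

Returning a list of reasons rather than a bare boolean keeps the decision explainable in plain language when an action is blocked.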
Beyond that, we see AirLock becoming a general execution gateway for AI agents, not just for GitHub-to-Slack flows. Email, ticketing systems, on-call tooling, internal admin actions, and production runbooks all need the same thing: a controlled border where proposed actions can be verified, reviewed, and recorded before they cross into the real world.
Built With
- css
- html
- javascript
- react
- sql
- supabase
- tailwind
- typescript