Inspiration
Woke up one day wanting to contribute to our org's servers. Opened the repo,
poked around — and there it was. A .env file. Just sitting there, pushed
like it was README.md. Then I remembered another one — a GCP credentials
JSON on someone else's repo I had access to. Both repos handled auth. If
either of those were public, it's cooked. Simple leak. Massive consequences.
Friday night downtime. Monday morning incident report.
I'm a 1st year BS CpE student at Bulacan State University, co-founder and CTO of Seekers Guild, and apparently someone who cannot let things go. So instead of just sending a "hey bro your .env is public" message and moving on, I built Bantay. Filipino for "guardian" — the person who stays awake so everyone else can sleep safely.
What it does
Bantay is a pre-push git hook with a brain. When you run git push, it
intercepts before anything hits the remote and runs two layers of analysis:
Layer 1 — Fast regex + secretlint: catches obvious secrets immediately.
AWS keys, .env files, GCP credentials, private keys, hardcoded tokens —
flagged before the LLM even wakes up.
Layer 2 — LLM risk scoring: ambiguous findings — high entropy strings, dynamic credential patterns — get sent to a Vultr-hosted Qwen 2.5 Coder model for context-aware scoring. The LLM never sees raw secret values. It receives a masked metadata envelope only — pattern type, file, line number, masked value. Enough context to reason about risk, not enough to leak anything.
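To make the idea concrete, here's a minimal sketch of what a masked metadata envelope could look like. The type and function names are illustrative, not Bantay's actual API:

```typescript
// Hypothetical sketch of the masked envelope: the LLM receives pattern
// context, never the raw secret value.
interface FindingEnvelope {
  patternType: string;   // e.g. "aws-access-key"
  file: string;
  line: number;
  maskedValue: string;   // first/last chars only, middle redacted
}

function maskValue(secret: string, keep = 4): string {
  // Too short to mask partially? Redact everything.
  if (secret.length <= keep * 2) return "*".repeat(secret.length);
  return `${secret.slice(0, keep)}${"*".repeat(secret.length - keep * 2)}${secret.slice(-keep)}`;
}

function toEnvelope(patternType: string, file: string, line: number, raw: string): FindingEnvelope {
  // The raw value is dropped here; only the masked form leaves this function.
  return { patternType, file, line, maskedValue: maskValue(raw) };
}
```

The key property: anything downstream of `toEnvelope` (the prompt, the HTTP call, provider-side logs) can only ever see the masked form.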
Then based on risk tier:
| Tier | Action |
|---|---|
| LOW | Push goes through immediately |
| MEDIUM | Auth0 CIBA fires → ntfy notification → Guardian approval → terminal resumes |
| HIGH | Auto-blocked, no questions asked |
Fail-closed by design. LLM timeout? Blocked. CIBA expiry? Blocked. Network error? Blocked. If the guardian can't do its job, nothing gets through.
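The fail-closed rule can be captured in one small wrapper. This is an illustrative sketch, not Bantay's actual code: any error or timeout in a decision step resolves to "block", never "allow":

```typescript
// Fail-closed sketch: if the guardian can't do its job, nothing gets through.
type Verdict = "allow" | "block";

async function failClosed(
  step: () => Promise<Verdict>,
  timeoutMs: number
): Promise<Verdict> {
  const timeout = new Promise<Verdict>((resolve) =>
    setTimeout(() => resolve("block"), timeoutMs)
  );
  try {
    // Whichever settles first wins; a hung LLM call becomes a block.
    return await Promise.race([step(), timeout]);
  } catch {
    // Network error, CIBA expiry, anything unexpected: block.
    return "block";
  }
}
```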
How we built it
The core is a LangGraph StateGraph — a graph-based pipeline with nodes
for scanning, scoring, and deciding. MEDIUM risk triggers a LangGraph
Interrupt, which pauses the graph and hands control to the Auth0 CIBA flow.
When the user approves on Guardian, the graph resumes and the push goes
through.
```
git push
  → pre-push hook
    → bantay scan
      → regex + secretlint (Layer 1)
      → LLM risk scoring via Vultr (Layer 2)
        → LangGraph decision node
          → LOW → allow
          → MEDIUM → interrupt → CIBA → ntfy → Guardian → resume
          → HIGH → block
```
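The decision node's routing boils down to a three-way switch. This is a plain-TypeScript sketch of the tiers (the real pipeline pauses the MEDIUM branch with a LangGraph interrupt; the approval callback here stands in for the CIBA → ntfy → Guardian round trip):

```typescript
// Illustrative tier routing, not the actual LangGraph node.
type Tier = "LOW" | "MEDIUM" | "HIGH";

async function decide(
  tier: Tier,
  requestGuardianApproval: () => Promise<boolean> // stub for the out-of-band CIBA flow
): Promise<"allow" | "block"> {
  switch (tier) {
    case "LOW":
      return "allow";  // push goes through immediately
    case "MEDIUM":
      // Pause until the human approves or denies on a separate device.
      return (await requestGuardianApproval()) ? "allow" : "block";
    case "HIGH":
      return "block";  // auto-blocked, no questions asked
  }
}
```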
Auth0 CIBA handles the human-in-the-loop approval. Normal auth flows are synchronous — you're stuck waiting. CIBA is async. Auth0 pushes a notification to Guardian on your phone, you tap approve or deny, and the terminal resumes on its own. The approval happens out-of-band on a separate trusted device — a completely different threat model from a terminal prompt.
Token Vault wraps all GitHub API calls — the agent never handles raw tokens. Zero raw credentials in code, ever.
AES-256-GCM encrypted secrets stored locally in ~/.bantay/secrets
with a random master key persisted to the shell rc file. Nothing plaintext,
nothing in .env.
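For reference, here's a minimal AES-256-GCM round trip with `node:crypto`, assuming a layout like Bantay's (random IV per record, auth tag stored alongside the ciphertext). The framing is illustrative, not the actual `~/.bantay/secrets` format:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

function encrypt(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // 96-bit IV, the GCM recommendation
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  // iv:tag:ciphertext, hex-encoded; the tag authenticates the record
  return [iv, cipher.getAuthTag(), data].map((b) => b.toString("hex")).join(":");
}

function decrypt(record: string, key: Buffer): string {
  const [iv, tag, data] = record.split(":").map((h) => Buffer.from(h, "hex"));
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag); // throws if the record was tampered with
  return Buffer.concat([decipher.update(data), decipher.final()]).toString("utf8");
}
```

GCM's auth tag is the point here: a flipped byte anywhere in the record makes decryption throw instead of silently returning garbage.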
ntfy handles out-of-band push notifications to the developer's phone
when CIBA fires. Self-hosted at ntfy.kuyacarlo.dev with lazy auth
detection — tries unauthenticated first, prompts on 403, saves credentials
encrypted for future use.
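The lazy-auth logic is roughly: publish unauthenticated, and only prompt for credentials when the server answers 403. A sketch with an injected fetch-like function so the flow is testable; the endpoint and retry details are assumptions, not ntfy's exact API:

```typescript
// Lazy auth probe: try open access first, fall back to Basic auth on 403.
type Fetchish = (
  url: string,
  init?: { headers?: Record<string, string> }
) => Promise<{ status: number }>;

async function publishWithLazyAuth(
  fetchFn: Fetchish,
  url: string,
  getCredentials: () => Promise<string> // e.g. prompt user, then cache encrypted
): Promise<number> {
  const first = await fetchFn(url);
  if (first.status !== 403) return first.status; // unauthenticated worked
  const basic = Buffer.from(await getCredentials()).toString("base64");
  const retry = await fetchFn(url, { headers: { Authorization: `Basic ${basic}` } });
  return retry.status;
}
```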
The monorepo is split into @bantay/core (detection engine, LangGraph
pipeline, Auth0, secrets) and @bantay/cli (commands, hook installer,
terminal output). 47 tests, 94.37% statement coverage.
Challenges we ran into
The stdin hang. The original pre-push hook used readFileSync("/dev/stdin")
to read git's push ref info — a blocking call. If nothing was piped in, it
hung forever with no output. Fixed with an async stdin read with a 500ms
timeout and a graceful fallback chain.
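The shape of that fix looks something like this (a sketch with the stream injected for testability, not the exact hook code): read asynchronously, and if nothing arrives within the timeout, fall back to an empty string instead of hanging:

```typescript
import type { Readable } from "node:stream";

// Read a stream without blocking; give up after `ms` and return what we have.
function readWithTimeout(stream: Readable, ms: number): Promise<string> {
  return new Promise((resolve) => {
    let data = "";
    const done = (value: string) => {
      clearTimeout(timer);
      resolve(value);
    };
    // Graceful fallback: nothing piped in within `ms` → empty ref info.
    const timer = setTimeout(() => done(data), ms);
    stream.on("data", (chunk) => (data += chunk.toString()));
    stream.on("end", () => done(data));
    stream.on("error", () => done(data));
  });
}
```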
Identity mapping 403. Auth0 CIBA requires a primary auth0| identity
but GitHub social login returns github| IDs. The Management API call to
resolve this was 403ing because the app type (Regular Web Application)
doesn't support M2M grants. Spent significant time debugging — eventual fix
was testing CIBA directly via curl with the github| ID and discovering
Auth0 Guardian accepts it natively as long as MFA enrollment exists.
Initial commit edge case. git diff HEAD~1..HEAD throws on a repo's
first commit because there's no parent. Fixed with a getGitContext()
utility that detects isInitialCommit and routes to git show -p HEAD
instead.
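The guard itself is small. A sketch with the git runner injected (so it's testable without a repo); the helper name is illustrative, not the actual getGitContext() internals:

```typescript
// Probe for a parent commit and pick the diff command accordingly.
type Run = (cmd: string) => string; // throws on non-zero exit, like execSync

function diffCommandFor(run: Run): string {
  let isInitialCommit = false;
  try {
    run("git rev-parse --verify HEAD~1"); // fails when HEAD has no parent
  } catch {
    isInitialCommit = true;
  }
  return isInitialCommit ? "git show -p HEAD" : "git diff HEAD~1..HEAD";
}
```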
readline vs HTTP server conflict. The login flow spawned a local HTTP server for the OAuth callback while simultaneously trying to prompt for ntfy credentials via readline. Stdin conflict — the server never received the callback. Fixed by collecting all inputs before the server starts.
Password input echoing. Standard readline echoes everything typed.
Raw mode implementation broke on paste — crashed the terminal entirely.
Solved with the prompts library which handles hidden input and paste
natively across all terminal types.
Accomplishments that we're proud of
The full CIBA loop working end to end — git push triggers the hook,
Bantay scans and finds a source map, CIBA fires, ntfy delivers the
notification to the phone, Guardian push notification appears, tap approve,
terminal prints ✅ Authorization approved. Push allowed. That whole flow,
working, in a hackathon.
The metadata envelope design — raw secret values never touching the LLM — is something we're genuinely proud of. It's the right pattern for AI security tooling and it wasn't obvious until we thought hard about the threat model.
94.37% test coverage on the core library in a 72-hour hackathon. TDD actually paid off — caught several edge cases before they became demo failures.
The bantay login flow going from zero to fully configured in one command —
Auth0 OAuth, encrypted secrets, ntfy auth probe, Guardian enrollment check —
with no .env file, no manual configuration, no copy-pasting tokens.
What we learned
Auth0 CIBA is one of the most underused features in the identity space. It's exactly the right primitive for human-in-the-loop security tooling — async, out-of-band, on a trusted separate device. The moment it clicked was the moment the whole project made sense.
Fail-closed is a design decision, not a feature. You decide it once at the start and everything else follows from it. Every error handler, every timeout, every fallback becomes obvious.
The metadata envelope pattern — masking secrets before LLM inference — is something every AI security tool should be doing by default. It's not obvious until you think about what happens if the API call gets intercepted or logged on the provider side.
Building a CLI tool that works reliably across terminal types, shells, and git edge cases is genuinely harder than building the AI parts. More edge cases than features.
What's next for Bantay
- v0.2 — CI integration: bantay scan --ci for GitHub Actions, block PRs the same way it blocks local pushes, status checks on pull requests
- v0.3 — In-house detection engine: custom policy DSL in .bantay.yaml, no secretlint dependency, configurable risk thresholds per branch and file pattern
- v1.0 — Team visibility: dashboard showing what's being flagged across the engineering team, audit log of all blocked and approved pushes, Slack and Discord integration
- v1.1 — Multi-tenant orgs: org-level Auth0 tenants, role-based approval routing, SSO support
- Maybe v2 — MCP integration: bantay as an MCP server for AI coding agents, so the guardian watches the agents too
built in 72 hours. one .env file living rent-free in my head the entire
time.
tara na, nandito na ang bantay... (let's go, the guardian is here now.)
Bonus Blog Post: How Token Vault Changed the Way I Think About Agent Credentials
I almost didn't use Token Vault the way it was intended.
My first instinct was to store the GitHub token directly in the encrypted
~/.bantay/secrets file alongside everything else — AES-256-GCM, safe
enough, right? The agent reads it, makes the API call, done.
Then I thought about what that actually means. The agent — a LangGraph node running inside a git hook — would be handling a raw GitHub OAuth token. Decrypting it, holding it in memory, passing it to an HTTP call. If anything in that pipeline logged, crashed with a stack trace, or got intercepted, the token is exposed. The agent becomes the weakest link.
Token Vault solves this by removing the token from the agent entirely. The agent never sees the raw credential. It calls the protected tool, Token Vault handles the OAuth exchange and injects the token at the HTTP layer, and the response comes back clean. The agent operates on data, not on secrets.
For Bantay, this meant the GitHub API call for repository metadata — checking
whether a repo is public or private — happens without the scanning agent ever
touching the token. The LangGraph node that calls getRepoVisibility() has
no idea what credential is being used. It just gets an answer.
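The boundary is easiest to see in code. This is a generic sketch of the pattern, not Auth0 Token Vault's actual API: the exchange function stands in for the vault-side OAuth exchange, and the token only ever exists inside the tool layer's closure:

```typescript
// "Authorized to act, not authorized to know": the agent holds only the
// returned function, which maps data in to data out. Names are illustrative.
type Exchange = () => Promise<string>; // vault-side token exchange, opaque to the agent
type Http = (url: string, headers: Record<string, string>) => Promise<{ private: boolean }>;

function makeGetRepoVisibility(exchange: Exchange, http: Http) {
  return async (owner: string, repo: string): Promise<"public" | "private"> => {
    const token = await exchange(); // token lives only inside this closure
    const res = await http(`https://api.github.com/repos/${owner}/${repo}`, {
      Authorization: `Bearer ${token}`,
    });
    return res.private ? "private" : "public"; // the agent gets an answer, not a credential
  };
}
```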
This pattern matters more than it seems. As AI agents get more capable and more integrated into developer workflows, the question of what the agent knows becomes a security boundary in itself. An agent that holds credentials is an agent that can leak credentials — through logs, through errors, through prompt injection, through a compromised dependency in the tool chain.
Token Vault enforces least privilege at the identity layer. The agent is authorized to act, but it is not authorized to know. That distinction is exactly what zero-trust engineering looks like when applied to AI.
Building Bantay taught me that the right architecture for agentic AI isn't just about what the agent can do — it's about what the agent is allowed to see. Token Vault is the primitive that makes that boundary real.
— Karlo, 1st year BS CpE @ BulSU, CTO @ Seekers Guild
Built With
- aes-256-gcm
- auth0
- ciba
- git
- langgraph
- linux
- node.js
- ntfy
- pnpm
- qwen
- secretlint
- token-vault
- tsup
- typescript
- vultr