VibeGuard: A Guardian for Vibe Coders

Inspiration

We kept seeing the same pattern in our friends' projects.

You're a small business owner. You used AI to build your web app — Stripe checkout, customer database, all of it. Then you need help. A contractor for a new feature. An advisor reviewing your architecture. A friend giving design feedback. The moment any of them touches your codebase, their AI agents get full access by default. The contractor's Cursor sees your Stripe live key. The advisor's Claude sees your customer database. Your codebase has no boundaries.

This isn't hypothetical. Recent research found that roughly 20% of vibe-coded production apps expose live secrets. The risk grows once contractors and their AI agents join the workflow, because production credentials and sensitive code then travel far beyond the original builder.

The hackathon theme — "Build what agents want" — shaped our approach. Agents don't need unrestricted access; they need well-defined scope. That became the foundation for VibeGuard.

What it does

VibeGuard is a secure handoff layer for AI-native development workflows in Cursor and Claude Code.

A project owner can simply type: “Hand off the frontend to Sarah.” VibeGuard then:

Analyzes the repository using Nia and Greptile to identify secrets, sensitive logic, APIs, and risk areas.
Guides the owner through a short plain-English workflow to determine what should and shouldn’t be shared.
Generates a sanitized project twin with mocked APIs, fake credentials, and protected infrastructure boundaries.
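
As a rough illustration of the "fake credentials" step, here is a minimal sketch that rewrites .env-style text so secret-looking values are replaced with placeholders. The function name sanitize_env, the SECRET_NAME_HINTS list, and the placeholder format are assumptions made for this example, not VibeGuard's actual implementation, and a real scanner would need far richer detection than key-name matching.

```python
import re

# Hypothetical key-name hints used only for this sketch.
SECRET_NAME_HINTS = ("KEY", "SECRET", "TOKEN", "PASSWORD")

# Matches simple NAME=value lines; comments and blank lines do not match.
ASSIGNMENT = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*)$")


def sanitize_env(text: str) -> tuple[str, list[str]]:
    """Return sanitized .env-style text and the variable names that were replaced."""
    sanitized_lines = []
    replaced = []
    for line in text.splitlines():
        match = ASSIGNMENT.match(line)
        if match and any(hint in match.group(1).upper() for hint in SECRET_NAME_HINTS):
            name = match.group(1)
            # Swap the real value for an obviously fake placeholder.
            sanitized_lines.append(f"{name}=fake_{name.lower()}_for_handoff")
            replaced.append(name)
        else:
            # Non-secret lines pass through unchanged.
            sanitized_lines.append(line)
    return "\n".join(sanitized_lines), replaced
```

The list of replaced names is returned alongside the text so that an owner-side summary could recommend which credentials to rotate.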

The contractor receives a fully functional development environment that preserves the original developer experience without exposing production secrets or internal systems.

VibeGuard also generates two concise briefs:

An owner-side memory file (vibeguard-owner-memory.md) summarizing detected risks, this handoff's decisions, and recommended credential rotations.
A contractor-side brief (vibeguard-contractor-brief.md) that lives inside the sanitized workspace. The contractor's AI agent reads it first to learn the scope, what's mocked, and what's off-limits.
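
To make the contractor-side brief concrete, the sketch below renders one from a small record of handoff decisions. The HandoffDecision fields, the render_contractor_brief function, and the section headings are hypothetical choices for this example. The actual layout of vibeguard-contractor-brief.md may differ.

```python
from dataclasses import dataclass, field


@dataclass
class HandoffDecision:
    """Hypothetical record of what the owner agreed to share."""
    contractor: str
    scope: list[str]
    mocked: list[str] = field(default_factory=list)
    off_limits: list[str] = field(default_factory=list)


def render_contractor_brief(decision: HandoffDecision) -> str:
    """Render the brief as Markdown text for the contractor's AI agent."""
    def bullets(items: list[str]) -> str:
        # An empty section is stated explicitly so the agent does not guess.
        return "\n".join(f"- {item}" for item in items) or "- (none)"

    return (
        f"# VibeGuard contractor brief for {decision.contractor}\n\n"
        f"## In scope\n{bullets(decision.scope)}\n\n"
        f"## Mocked\n{bullets(decision.mocked)}\n\n"
        f"## Off-limits\n{bullets(decision.off_limits)}\n"
    )
```

For a handoff like "Hand off the frontend to Sarah", the record might be HandoffDecision("Sarah", ["frontend/"], ["Stripe checkout API"], ["infra/"]).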

Scale your team without expanding your attack surface.

How we built it

We intentionally separated deterministic logic, workflow policy, and interaction into distinct layers:

The MCP server (Python) handles deterministic operations such as scanning repositories, classifying files, sanitizing code, and applying transformations.
The skill layer (Markdown) defines the conversational workflow and translates user intent into structured actions.
The host LLM inside Cursor or Claude Code manages the interaction itself, allowing us to leverage the user’s existing AI environment without requiring additional APIs or subscriptions.
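
The split between the skill layer and the server can be pictured as structured actions flowing into deterministic handlers. The sketch below is a plain-Python stand-in for that idea and does not use the real MCP SDK. The action shape ({"tool": ..., "args": ...}), the tool decorator, and the list_env_files handler are all assumptions made for illustration.

```python
from typing import Callable

# Registry of deterministic handlers, keyed by tool name (hypothetical design).
TOOLS: dict[str, Callable[..., dict]] = {}


def tool(name: str):
    """Register a function as a named, deterministic tool."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


@tool("list_env_files")
def list_env_files(paths: list[str]) -> dict:
    # Same input always yields the same sorted output: no model calls here.
    env_files = sorted(p for p in paths if p.split("/")[-1].startswith(".env"))
    return {"env_files": env_files}


def dispatch(action: dict) -> dict:
    """Run a structured action produced by the conversational layer."""
    name = action.get("tool")
    if name not in TOOLS:
        return {"ok": False, "error": f"unknown tool: {name}"}
    result = TOOLS[name](**action.get("args", {}))
    return {"ok": True, "result": result}
```

Keeping the handlers free of model calls is what makes this layer straightforward to cover with ordinary unit tests.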

We integrated:

Nia for codebase indexing and contextual understanding.
Greptile for secret detection and security scanning.

The system is built with Python and MCP tooling, applies changes through repository patching workflows, and is backed by a comprehensive automated test suite.

Challenges we ran into

One of the biggest architectural decisions was determining where the LLM should live. Early versions embedded question-generation directly inside the MCP server, but that introduced extra API costs and reduced conversational context. Moving intelligence into the skill layer made the workflow simpler, cheaper, and more adaptable.

Another challenge was defining what qualifies as “sensitive.” Some risks are obvious, like API keys. Others are contextual — internal business logic, infrastructure connectors, or backend orchestration files that technically aren’t secrets but still shouldn’t be broadly shared.
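
One way to picture the two kinds of sensitivity is a two-tier classifier: content patterns catch obvious secrets, while path rules flag files that are sensitive only in context. This is a minimal sketch under stated assumptions. The pattern list, the glob list, and the three labels are hypothetical, and the globs here stand in for decisions the owner would make during the workflow.

```python
import re
from fnmatch import fnmatch

# Tier 1: obvious secrets, detected from file content (illustrative patterns only).
SECRET_PATTERNS = [
    re.compile(r"sk_live_[A-Za-z0-9]+"),                # Stripe-style live key prefix
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
]

# Tier 2: contextual sensitivity, inferred from where the file lives (hypothetical).
CONTEXTUAL_GLOBS = ["infra/*", "*/billing/*", "*orchestrat*"]


def classify(path: str, content: str) -> str:
    """Label a file as 'secret', 'contextual', or 'shareable'."""
    if any(pattern.search(content) for pattern in SECRET_PATTERNS):
        return "secret"
    if any(fnmatch(path, glob) for glob in CONTEXTUAL_GLOBS):
        return "contextual"
    return "shareable"
```

Checking content before path means a leaked key is reported as a secret even when it sits in an otherwise shareable directory.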

We also discovered that repository sanitization is highly framework-specific. Refactor patches that work for one stack rarely transfer cleanly to another, so we designed the system around extensible advisory-driven transformations.
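
An advisory-driven design could be sketched as a table of per-framework rules that the sanitizer looks up, so supporting a new stack means adding rows instead of rewriting patch logic. Everything in this example is an assumption for illustration: the Advisory fields, the action names, and the specific path globs are not taken from VibeGuard's code.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class Advisory:
    """Hypothetical rule: for this framework, treat matching paths this way."""
    framework: str    # "*" applies to every framework
    target_glob: str
    action: str       # e.g. "mock" or "exclude"
    note: str


ADVISORIES = [
    Advisory("nextjs", "pages/api/*", "mock", "Replace route handlers with canned responses."),
    Advisory("fastapi", "app/routers/*", "mock", "Replace router endpoints with stub handlers."),
    Advisory("*", ".env*", "exclude", "Never copy environment files into the twin."),
]


def advisories_for(framework: str, path: str) -> list[Advisory]:
    """Return the advisories that apply to a path in a given framework."""
    return [
        advisory for advisory in ADVISORIES
        if advisory.framework in (framework, "*") and fnmatch(path, advisory.target_glob)
    ]
```

A Next.js API route matches only the Next.js rule, while an environment file matches the wildcard rule regardless of framework.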

Finally, balancing realism with demo speed was difficult. Fully provisioning live infrastructure during a short demo added too much overhead, so we created a deterministic flow that accurately demonstrates the handoff experience in under two minutes.

Accomplishments we're proud of

Built a fully working end-to-end secure handoff workflow using a real MCP server and production-style repository transformations.
Achieved strong evaluation results, including major improvements in secure delegation workflows and high resistance to adversarial data extraction attempts.
Shipped a complete automated test suite covering classification, sanitization, patching, templates, and repository transformations.
Designed the system to run entirely on the user’s existing Cursor or Claude Code subscription with no additional API dependencies.
Successfully combined multiple MCP tools into a unified security workflow focused on AI-native development.

What we learned

Skills and workflow design matter as much as model capability. A well-structured conversational layer dramatically improved usability for non-technical founders.
Deterministic infrastructure makes agent workflows more reliable. Keeping the MCP layer predictable reduced errors and simplified debugging.
Limiting access actually improves agent performance. Scoped environments reduced mistakes and prevented accidental exposure of sensitive data.
Most founders don’t want security training — they want secure defaults built directly into their workflow.

What’s next for VibeGuard

Expanding framework support across Next.js, SvelteKit, Remix, FastAPI, and more.
Moving from one-time handoff protection to continuous monitoring during development.
Supporting role-based contractor environments with different levels of repository access.
Adding automated credential revocation and rotation after engagements end.
Launching a managed cloud-hosted version for teams that prefer not to run MCP infrastructure locally.

VibeGuard helps AI-native teams move fast without exposing what matters most.