VibeCheck — AI Security for the Age of Vibe Coding

Inspiration

We've all seen it. A founder spins up a full SaaS app in a weekend using Bolt, Lovable, or Cursor. It has auth, a database, payments, an API — and it's live by Sunday night. The problem? 45% of AI-generated code contains security vulnerabilities (Veracode, 2025). Hardcoded secrets. SQL injection. Broken auth. Unauthenticated admin routes. All shipped to production without a second thought.

We are both developers who vibe-code. We've done it ourselves. And we realized there was no tool built specifically for this new category of builder — someone who moves fast, ships often, and has never run a security audit in their life.

That's what inspired VibeCheck. Not another enterprise security tool buried in dashboards and CVE databases. Something dead simple: paste your GitHub repo, get your vulnerabilities in plain English, click fix, done.


What It Does

VibeCheck is a two-tier AI security scanner for GitHub repositories.

  1. Connect your GitHub — OAuth authentication gives VibeCheck read access to your repos
  2. Select a repo — pick any project you've built
  3. Scan — VibeCheck runs a live streaming scan across every file
  4. See your Security Score — a single number from 0–100 with a full breakdown of every vulnerability found, sorted by severity
  5. Fix with one click — VibeCheck generates a patched code snippet for every finding and pushes it as a GitHub PR to your repo

Results stream to your screen in real time as each file is analyzed — you don't wait for the full scan to finish before seeing your first critical finding.


How We Built It

The AI Pipeline

We designed a two-tier pipeline that balances speed and depth:

Tier 1 — Featherless AI (Fast Pass)
Every file in the repo is sent to Qwen2.5-Coder-32B-Instruct via Featherless serverless inference. This classifies each file as HIGH, MEDIUM, LOW, or CLEAN risk in 1–2 seconds per file, running 4 files concurrently.
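The "4 files concurrently" behavior can be sketched as a small worker pool. This is a minimal illustration, not the project's code: `mapWithConcurrency` and `classifyFile` are hypothetical names, and `classifyFile` is a stub standing in for the real model call.

```typescript
type RiskLevel = "HIGH" | "MEDIUM" | "LOW" | "CLEAN";

// Hypothetical helper: run `worker` over `items` with at most `limit` calls in flight.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner claims the next unclaimed index until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  const width = Math.max(1, Math.min(limit, items.length));
  const runners: Promise<void>[] = [];
  for (let r = 0; r < width; r++) runners.push(runner());
  await Promise.all(runners);
  return results; // same order as `items`
}

// Stub classifier (assumption): a real version would send the file to the model.
async function classifyFile(path: string): Promise<RiskLevel> {
  return path.includes("auth") ? "HIGH" : "CLEAN";
}
```

Calling `mapWithConcurrency(paths, 4, classifyFile)` would keep four classifications in flight and return the results in input order.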

Tier 2 — IBM watsonx.ai (Deep Scan)
Only HIGH and MEDIUM files are escalated to IBM Granite (ibm/granite-34b-code-instruct). Granite performs forensic-level analysis — exact line numbers, CVE categories, severity scores, and plain-English fix suggestions. Watson NLU runs in parallel on the README to detect app context (payments? user auth? public-facing?) and adjusts severity scores accordingly.
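The escalation rule itself is a simple filter over the Tier 1 output. A minimal sketch, assuming a hypothetical `ClassifiedFile` shape and function name:

```typescript
type RiskLevel = "HIGH" | "MEDIUM" | "LOW" | "CLEAN";

// Hypothetical shape of a Tier 1 result.
interface ClassifiedFile {
  path: string;
  risk: RiskLevel;
}

// Only HIGH and MEDIUM files go on to the slower deep scan.
function selectForDeepScan(files: readonly ClassifiedFile[]): ClassifiedFile[] {
  return files.filter((f) => f.risk === "HIGH" || f.risk === "MEDIUM");
}
```

Keeping LOW and CLEAN files out of Tier 2 is what limits the number of slow deep-scan calls per repo.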

The math behind the security score:

$$ S = \max\left(0,\ 100 - \min(25c,\ 75) - \min(10h,\ 40) - \min(5m,\ 20) - \min(2l,\ 10)\right) $$

Where $c$, $h$, $m$, $l$ are the counts of CRITICAL, HIGH, MEDIUM, and LOW findings.
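The formula translates directly into code. The sketch below uses hypothetical names (`FindingCounts`, `securityScore`); only the arithmetic comes from the formula above.

```typescript
// Counts of findings per severity level.
interface FindingCounts {
  critical: number;
  high: number;
  medium: number;
  low: number;
}

// Each severity has a per-finding penalty and a cap on its total contribution.
function securityScore({ critical, high, medium, low }: FindingCounts): number {
  const penalty =
    Math.min(25 * critical, 75) +
    Math.min(10 * high, 40) +
    Math.min(5 * medium, 20) +
    Math.min(2 * low, 10);
  return Math.max(0, 100 - penalty);
}
```

For example, 1 CRITICAL, 2 HIGH, 3 MEDIUM, and 4 LOW findings give a penalty of 25 + 20 + 15 + 8 = 68, so the score is 32. Because the caps sum to 145, the outer max is what keeps the score from going negative.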

Streaming Architecture
Rather than making users wait for a full scan, results stream via Server-Sent Events (SSE). The UI populates live as each file finishes. Watson enrichment runs as a background job via Vercel's waitUntil — the user sees partial results in ~25 seconds while deeper analysis continues silently.
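On the wire, each SSE message is a few text fields followed by a blank line. The sketch below shows one way per-file results could be framed and emitted as they finish; the event names (`file`, `done`), the `FileResult` shape, and the `send` callback are assumptions for illustration, not the project's actual protocol.

```typescript
// Hypothetical per-file payload.
interface FileResult {
  path: string;
  risk: string;
}

// Frame one SSE message: an event name, a single-line JSON payload, then a blank line.
function formatSseEvent(event: string, payload: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Emit a frame as soon as each file is scanned, then a final "done" frame.
// `send` would write to the HTTP response stream in a real route handler.
async function streamScan(
  files: readonly string[],
  scanFile: (path: string) => Promise<FileResult>,
  send: (frame: string) => void,
): Promise<void> {
  for (const path of files) {
    send(formatSseEvent("file", await scanFile(path)));
  }
  send(formatSseEvent("done", { files: files.length }));
}
```

On the client, an `EventSource` listener for each event name would append findings to the page as the frames arrive.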

The Auto-Fix Engine

When a user clicks "Apply Fix", the vulnerability's code snippet is sent back to IBM Granite with a strict prompt: generate the minimal patch to fix this specific issue without changing any surrounding logic. The diff is rendered in a side-by-side viewer. One more click opens a GitHub PR on a new branch (vibecheck/fix-{id}) with a detailed description of what was fixed and why.

The Stack

| Layer | Technology |
| --- | --- |
| Frontend | Next.js 14, TypeScript, Tailwind CSS, shadcn/ui |
| AI (Fast Pass) | Featherless AI · Qwen2.5-Coder-32B-Instruct |
| AI (Deep Scan) | IBM watsonx.ai · Granite-34b-code-instruct |
| AI (Context) | IBM Watson NLU |
| Storage | IBM Cloud Object Storage + IBM Cloudant |
| Auth | NextAuth.js + GitHub OAuth |
| Deployment | Vercel |

Challenges We Faced

The latency problem was our biggest technical hurdle. IBM watsonx.ai produces exceptional results but takes 10–15 seconds per file. At that rate, a repo with 20 high-risk files would take roughly 3 to 5 minutes if scanned sequentially, which is unusable for an interactive tool. We solved this by removing watsonx from the hot path entirely. Featherless handles real-time classification and produces immediate findings. watsonx runs as a background enrichment pass after the user already sees their results, updating the UI silently via SSE as it completes.

The Vercel setImmediate trap caught us early. Detaching background work with setImmediate doesn't keep a Vercel serverless function alive after it returns a response, so the work can be cut off before it finishes. We switched to waitUntil from @vercel/functions, which is the mechanism Vercel provides for continuing work after the response has been sent.
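The respond-first, enrich-later pattern can be sketched as follows. To keep the example self-contained, `waitUntil` is passed in as a parameter; in a Vercel deployment it would be imported from `@vercel/functions`. The handler name, the `enrich` callback, and the response shape are hypothetical.

```typescript
// Same shape as the platform helper: accepts a promise to keep the function alive for.
type WaitUntil = (promise: Promise<unknown>) => void;

interface SimpleResponse {
  status: number;
  body: string;
}

function handleScan(waitUntil: WaitUntil, enrich: () => Promise<void>): SimpleResponse {
  // Start the background work and hand its promise to the platform,
  // so the invocation is not torn down before the work settles.
  waitUntil(enrich());
  // Respond immediately; the caller does not wait for enrichment.
  return { status: 202, body: "scan started" };
}
```

The key difference from `setImmediate` is that the platform is told about the pending promise, rather than the work being scheduled on an event loop that may be frozen once the response is returned.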

Featherless rate limits required building an exponential backoff retry layer with jitter on every API call. A single 429 from Featherless used to stall an entire batch. Now it retries gracefully and the scan continues.
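A retry layer of this kind might look like the sketch below. It uses "full jitter" (a random delay between 0 and an exponentially growing ceiling) and retries only on HTTP 429. The function names, default values, and the assumption that errors carry a numeric `status` property are all illustrative, not taken from the project.

```typescript
interface RetryOptions {
  maxAttempts?: number;
  baseDelayMs?: number;
  maxDelayMs?: number;
  sleep?: (ms: number) => Promise<void>; // injectable for testing
  random?: () => number; // injectable for testing
}

// Assumption: rate-limit errors expose a numeric `status` of 429.
function isRateLimited(err: unknown): boolean {
  return typeof err === "object" && err !== null &&
    (err as { status?: unknown }).status === 429;
}

// Full jitter: pick a random delay in [0, min(max, base * 2^attempt)).
function backoffDelay(attempt: number, baseMs: number, maxMs: number, random: () => number): number {
  const ceiling = Math.min(maxMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

async function withRetry<T>(call: () => Promise<T>, opts: RetryOptions = {}): Promise<T> {
  const maxAttempts = opts.maxAttempts ?? 5;
  const baseDelayMs = opts.baseDelayMs ?? 500;
  const maxDelayMs = opts.maxDelayMs ?? 8000;
  const sleep = opts.sleep ?? ((ms: number) => new Promise<void>((r) => setTimeout(r, ms)));
  const random = opts.random ?? Math.random;

  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      // Give up on non-429 errors or once the attempt budget is spent.
      if (!isRateLimited(err) || attempt + 1 >= maxAttempts) throw err;
      await sleep(backoffDelay(attempt, baseDelayMs, maxDelayMs, random));
    }
  }
}
```

The jitter matters when several files are in flight at once: without it, every request that hit the same 429 would retry at the same moment and trip the limit again.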

Coordinating two parallel workstreams was a challenge of its own. We built this as two developers working simultaneously on separate git branches. The critical unlock was defining all TypeScript interfaces in lib/types.ts, along with a MOCK_SCAN_REPORT constant, before splitting. Stream B built the entire UI against mock data while Stream A built the backend, and the integration was a near-seamless one-line swap.
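A contract-first split like this could look roughly as follows. Only the names `lib/types.ts` and `MOCK_SCAN_REPORT` come from the write-up; every field, the sample values, and the `getReport` function are hypothetical placeholders.

```typescript
// lib/types.ts (sketch): the shared contract both streams code against.
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

interface Finding {
  id: string;
  file: string;
  line: number;
  severity: Severity;
  title: string;
  suggestion: string;
}

interface ScanReport {
  repo: string;
  score: number;
  findings: Finding[];
}

// Illustrative mock data; 1 CRITICAL + 1 HIGH gives 100 - 25 - 10 = 65.
const MOCK_SCAN_REPORT: ScanReport = {
  repo: "example/demo-app",
  score: 65,
  findings: [
    { id: "f1", file: "lib/db.ts", line: 12, severity: "CRITICAL",
      title: "Hardcoded database password", suggestion: "Load the secret from an environment variable." },
    { id: "f2", file: "app/api/users/route.ts", line: 40, severity: "HIGH",
      title: "SQL built by string concatenation", suggestion: "Use a parameterized query." },
  ],
};

// The UI depends only on this signature, so replacing the mock with the
// real backend call is a change to this one assignment.
type GetReport = (repo: string) => Promise<ScanReport>;
const getReport: GetReport = async () => MOCK_SCAN_REPORT;
```

Because the UI only sees `GetReport`, it can be built and tested end to end before the backend exists.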


What We Learned

  • Streaming UX changes everything. A scan that "takes 30 seconds" feels instant when results appear after 5 seconds and keep coming. Perceived performance matters as much as actual performance.
  • IBM Granite is genuinely excellent at code security analysis. The quality of its vulnerability descriptions and fix suggestions exceeded our expectations: specific, accurate, and actionable, without hallucinating CVEs.
  • Serverless architecture forces you to think differently about long-running work. The background enrichment pattern we ended up with is more robust than the synchronous pipeline we started with.
  • Featherless AI's model flexibility is a real differentiator. Being able to swap security-specialized models per language — Python repo gets one model, JavaScript gets another — without managing any GPU infrastructure is genuinely powerful.

What's Next

  • Bulk Fix — fix all critical vulnerabilities in one PR
  • CI/CD Integration — GitHub Action that runs VibeCheck on every push and blocks merges if the security score drops below a threshold
  • IDE Plugin — scan as you type, not just before you deploy
  • Subscription tiers — free for public repos, paid for private repos with unlimited scans and team dashboards

Built With

  • featherless-ai
  • github-oauth
  • ibm-cloud-object-storage
  • ibm-cloudant
  • ibm-iam
  • ibm-watson-nlu
  • ibm-watsonx.ai
  • next.js-14
  • nextauth.js
  • octokit
  • server-sent
  • shadcn/ui
  • tailwind-css
  • typescript
  • vercel