VibeScan — AI-Generated Code Security Scanner

Inspiration

Roughly 45% of AI-generated code reportedly contains security vulnerabilities, and that figure is what pushed me to build VibeScan. I watched friends ship code written with Cursor and Claude that had hardcoded keys, eval() calls, and debug mode left on in production. Existing tools like GitLeaks focus on secrets, so they miss the insecure patterns that AI assistants actually write.

What it does

VibeScan is a CLI scanner built for AI-generated code that catches what GitLeaks misses. It uses three detection layers:

Regex (14 patterns): AWS, OpenAI, GitHub, and Stripe keys plus database URLs, matched in milliseconds
Shannon entropy: H = -Σ p·log₂ p over a token's character frequencies; flags unknown tokens that score ≥ 4.5 bits per character
AI-specific patterns (24 patterns): missing auth, SQL injection, debug mode, CORS wildcards, eval(), pickle.loads(), yaml.load(), path traversal
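
The entropy layer can be sketched in a few lines of Python. This is a minimal illustration of the formula and the 4.5 threshold quoted above, not VibeScan's actual code; the function names shannon_entropy and looks_like_secret are hypothetical.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 4.5  # bits per character, the threshold quoted above


def shannon_entropy(token: str) -> float:
    """Return H = -sum(p * log2(p)) over the character frequencies of token."""
    if not token:
        return 0.0
    counts = Counter(token)
    total = len(token)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(token: str, threshold: float = ENTROPY_THRESHOLD) -> bool:
    """Flag a token whose per-character entropy meets the threshold."""
    return shannon_entropy(token) >= threshold
```

Because the maximum entropy of a token with k distinct characters is log₂ k, a token needs at least 23 distinct characters to reach 4.5. Ordinary identifiers and short words fall well below that, which is what makes the score useful as a signal.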

On top of these layers, VibeScan offers live token validation (GitHub + OpenAI REST APIs), auto-fix safe-code snippets, Git hooks, HTML reports with a 0–100 security score (A–F grade), and a GitHub PR bot.
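
One way a 0–100 score with an A–F grade could be derived from findings is sketched below. The severity weights and grade cutoffs are assumptions chosen for illustration; this write-up does not state the values VibeScan actually uses.

```python
# Hypothetical severity weights; VibeScan's real weights are not stated here.
SEVERITY_PENALTY = {"critical": 25, "high": 15, "medium": 7, "low": 2}


def security_score(severities: list[str]) -> int:
    """Start from 100 and subtract a penalty per finding, floored at 0."""
    penalty = sum(SEVERITY_PENALTY.get(s, 0) for s in severities)
    return max(0, 100 - penalty)


def letter_grade(score: int) -> str:
    """Map a 0-100 score to A-F using assumed school-style cutoffs."""
    for cutoff, grade in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= cutoff:
            return grade
    return "F"
```

With these assumed weights, a repo with one critical and one high finding would score 60 and receive a D.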

How we built it

Pure Python: no backend, no database, no cloud. os.walk handles recursive scanning, re.search runs each layered pattern engine, Click and Rich power the CLI, and Jinja2 renders the HTML dashboards. A GitHub Actions workflow runs on every push. We built it in modular phases: secret patterns → entropy → AI patterns → token validator → reporting → hooks → PR bot.
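
The os.walk plus re.search approach can be sketched as follows. The two patterns are based on widely documented key formats and stand in for the real rule set; scan_tree, SECRET_PATTERNS, and SKIP_DIRS are hypothetical names, and the skipped directories are an assumed default.

```python
import os
import re

# Two illustrative secret patterns based on widely documented key formats;
# VibeScan's actual 14 patterns are not listed in this write-up.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}

SKIP_DIRS = {".git", "node_modules", "__pycache__"}  # assumed defaults


def scan_tree(root: str) -> list[dict]:
    """Walk root recursively and return one finding per matching line and rule."""
    findings = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for lineno, line in enumerate(handle, start=1):
                        for rule, pattern in SECRET_PATTERNS.items():
                            if pattern.search(line):
                                findings.append(
                                    {"rule": rule, "path": path, "line": lineno}
                                )
            except OSError:
                continue  # unreadable file: skip rather than abort the scan
    return findings
```

Calling scan_tree(".") from a project root would return a list of dictionaries that a reporting layer could then render.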

Challenges we ran into

Fixture files and docs produced false positives because they contain strings that match real patterns. We solved this with .vibescan.yml, which supports an allowlist, path exclusions, entropy threshold tuning, and a baseline mode. The entropy layer also reported the same string twice, once as Base64 and once as hex, so we added fingerprint deduplication. Finally, the regex patterns needed validation against messy, real-world AI-generated code, and the set grew from 8 to the current 38.
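
Fingerprint deduplication can be illustrated with a short sketch. It assumes each finding is a dictionary with path, line, and match keys and that the fingerprint is a SHA-256 hash of those three values; VibeScan's real fingerprint format is not described here.

```python
import hashlib


def fingerprint(path: str, line: int, secret: str) -> str:
    """Hash location plus matched text, ignoring which rule reported it."""
    raw = f"{path}:{line}:{secret}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def deduplicate(findings: list[dict]) -> list[dict]:
    """Keep the first finding for each fingerprint, preserving order."""
    seen = set()
    unique = []
    for finding in findings:
        key = fingerprint(finding["path"], finding["line"], finding["match"])
        if key not in seen:
            seen.add(key)
            unique.append(finding)
    return unique
```

Leaving the rule name out of the fingerprint is the design choice that matters: the same string flagged by a Base64 rule and a hex rule collapses into a single finding. Hashing also means a baseline file could store fingerprints without storing the secrets themselves.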

Accomplishments that we're proud of

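Scanning itself is fully offline with zero API cost, so it works on an airplane; only live token validation and the GitHub integrations need a network
38 patterns (14 secret + 24 AI-specific) plus entropy, across 3 detection layers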
0–100 scoring dashboard readable enough for non-technical stakeholders
Every finding ships with a fix snippet, cutting fix time from roughly 30 minutes to 2 by our estimate
16 unit tests covering entropy and scanner logic

What we learned

Entropy is a signal, not a detector, so threshold tuning against real repos is non-negotiable. Regex for security has to be conservative: broad enough to catch real issues, tight enough not to match print("hello"). False positive reduction isn't a bug fix; it's the product.
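
The broad-versus-tight trade-off is easier to see with a concrete pattern. The two rules below are illustrative only and are not VibeScan's real patterns; risky_lines is a hypothetical helper.

```python
import re

# Illustrative patterns only; VibeScan's real rules are not shown in this write-up.
# The lookbehind rejects method calls (model.eval) and longer names (literal_eval).
EVAL_CALL = re.compile(r"(?<![\w.])eval\s*\(")
DEBUG_ON = re.compile(r"\bdebug\s*=\s*True\b", re.IGNORECASE)


def risky_lines(source: str) -> list[tuple[int, str]]:
    """Return (line number, rule name) for each line that matches a rule."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if EVAL_CALL.search(line):
            hits.append((lineno, "eval"))
        if DEBUG_ON.search(line):
            hits.append((lineno, "debug_mode"))
    return hits
```

A naive pattern such as eval would also flag model.eval() and ast.literal_eval(), both of which are harmless. The lookbehind removes those false positives while still catching a bare eval( call. A line-based regex still cannot tell code from a comment or string literal, which is one reason an allowlist remains necessary.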

What's next

A --tool flag for detecting npm/PyPI supply chain attacks, an --explain flag for plain-English findings, SARIF output for GitHub Code Scanning, and a VS Code extension for inline scanning.

Built With

actions
api
click
colorama
github
jinja
openai
python
pyyaml
requests
rest
rich

Updates

Tanisha Kushwah started this project — May 19, 2026 05:29 AM EDT
