Inspiration
Every day, millions of developers paste proprietary code into AI tools such as ChatGPT, Claude, and Copilot to get help debugging, refactoring, or writing new features. Each paste can leak variable names that reveal business logic (customer_revenue, fraud_score), function names that expose architecture (sync_patient_records), string literals containing internal URLs, and comments that explain trade secrets.
Companies spend millions on firewalls, VPNs, and DLP tools — but their most sensitive IP walks out the door one prompt at a time. We built GhostCode to stop that.
What it does
GhostCode is a VS Code extension that acts as a privacy proxy between developers and AI tools. Before you share code with any AI, GhostCode replaces every user-defined symbol with an opaque token:

    Your code                               What the AI sees
    def calculate_revenue(transactions):    def gf_001(gv_001):
        total_income = 0                        gv_002 = 0
        for txn in transactions:                for gv_003 in gv_001:
            if txn.is_verified:                     if gv_003.gv_004:
                total_income += txn.amount              gv_002 += gv_003.gv_005
        return total_income                     return gv_002
The AI gives you a working answer using ghost tokens. GhostCode then restores all original names, so your code is fully functional and the AI never saw the names that carry your business logic.
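To make the renaming step concrete, here is a minimal sketch using Python's standard `ast` module (Python 3.9+ for `ast.unparse`). It is not GhostCode's actual implementation: the `GhostRenamer` class and `ghost` helper are hypothetical names, the sketch ignores scopes, and it renames every attribute, whereas a real tool would need to leave stdlib and framework attributes alone.

```python
import ast
import builtins


class GhostRenamer(ast.NodeTransformer):
    """Replace user-defined names with opaque tokens (illustrative only)."""

    def __init__(self):
        self.ghost_map = {}                  # original name -> ghost token
        self.counts = {"gf": 0, "gv": 0}     # gf = functions, gv = everything else

    def _token(self, name, kind):
        # Reuse the existing token so every occurrence maps consistently.
        if name not in self.ghost_map:
            self.counts[kind] += 1
            self.ghost_map[name] = f"{kind}_{self.counts[kind]:03d}"
        return self.ghost_map[name]

    def visit_FunctionDef(self, node):
        node.name = self._token(node.name, "gf")
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._token(node.arg, "gv")
        return node

    def visit_Name(self, node):
        # Leave builtins such as len or sum untouched.
        if not hasattr(builtins, node.id):
            node.id = self._token(node.id, "gv")
        return node

    def visit_Attribute(self, node):
        self.generic_visit(node)             # rename the object first
        node.attr = self._token(node.attr, "gv")
        return node


def ghost(source):
    """Return (ghost_code, ghost_map) for a Python source string."""
    renamer = GhostRenamer()
    tree = renamer.visit(ast.parse(source))
    return ast.unparse(tree), renamer.ghost_map
```

Calling `ghost()` on the `calculate_revenue` example returns the ghost code together with the name-to-token map, which stays on the local machine.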
Four privacy levels (each level adds to the one before it):
- Level 1 — Rename symbols + strip comments
- Level 2 — + Scrub domain-revealing strings and numbers
- Level 3 — + Isolate a single function with dependency stubs (AI only sees what you choose)
- Level 4 — + Generalize dimensions and loop bounds
Key Features:
- AST-based parsing (not regex) — understands scope, distinguishes your code from stdlib/frameworks
- Ghost Map sidebar with full symbol mapping visualization
- Smart literal classification — scrubs domain indicators, keeps math constants
- Function isolation — extract one function, stub everything else
- Risk Report — pre-send exposure assessment (LOW / MEDIUM / HIGH)
- AI change detection — after reveal, see exactly what the AI modified
- Audit log dashboard — immutable JSONL logs with SHA-256 hashes for compliance
- Repo-level security policies via .ghostcode.yaml
- Encrypted ghost maps (AES-128-CBC)
- Python and C/C++ support
- Zero-config setup — CLI bundled inside the extension, no pip install needed
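As an illustration of the audit log feature, the sketch below writes JSONL entries whose SHA-256 hash covers the previous entry's hash, which makes tampering detectable. The hash chain, the field names (`event`, `prev`, `hash`), and both function names are assumptions made for this example; the project page does not describe the actual log format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(event, prev_hash):
    body = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(log_lines, event):
    """Append one JSON line that is chained to the previous line's hash."""
    prev_hash = json.loads(log_lines[-1])["hash"] if log_lines else GENESIS
    entry = {"event": event, "prev": prev_hash, "hash": _digest(event, prev_hash)}
    log_lines.append(json.dumps(entry, sort_keys=True))


def verify(log_lines):
    """Return True only if every entry's hash and chain link are intact."""
    prev_hash = GENESIS
    for line in log_lines:
        entry = json.loads(line)
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(entry["event"], entry["prev"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Editing or deleting any earlier line changes the hashes that later lines depend on, so `verify` fails. Strictly speaking this makes the log tamper-evident; immutability would also depend on where the file is stored.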
How we built it
- VS Code Extension (TypeScript) — Commands, UI, tree views, webview panels, CodeLens, decorations
- Python CLI (bundled) — AST parsing, symbol renaming, literal scrubbing, function isolation, map encryption, audit logging
- Architecture: The extension spawns the Python CLI with the source file and privacy level. The CLI parses the AST, builds a bidirectional ghost map, transforms the code, and returns the ghost file. The map stays local — only the ghost code leaves your machine.
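The bidirectional ghost map described above could look roughly like the following. This is a simplified sketch, not the project's code: the `GhostMap` class is a hypothetical name, it assumes tokens keep the `gf_NNN` / `gv_NNN` shape shown earlier, and it ignores scopes and encryption.

```python
import re

# Ghost tokens are assumed to look like gf_001 or gv_042.
TOKEN = re.compile(r"\bg[fv]_\d+\b")


class GhostMap:
    """Two dictionaries kept in sync so lookups work in both directions."""

    def __init__(self):
        self.to_ghost = {}   # real name  -> ghost token
        self.to_real = {}    # ghost token -> real name

    def add(self, real, ghost):
        self.to_ghost[real] = ghost
        self.to_real[ghost] = real

    def reveal(self, ghost_code):
        """Swap known ghost tokens back; leave unknown tokens untouched."""
        return TOKEN.sub(
            lambda m: self.to_real.get(m.group(0), m.group(0)),
            ghost_code,
        )
```

Replacing tokens by pattern is safe on the way back because ghost tokens have a fixed, unambiguous shape. The forward direction cannot work this way, because real names collide with keywords, strings, and library symbols.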
Challenges we ran into
- Scope-aware renaming: The same variable name in different scopes needs different tokens. We solved this with full AST traversal that tracks scope chains.
- Literal classification: Not all strings should be scrubbed — "utf-8" and "\n" are safe, but "patient_records_db" is not. We built a classifier that uses heuristics (length, patterns, known safe values) to decide SCRUB, KEEP, or FLAG.
- Function isolation: Extracting a single function from a class while generating valid stubs for its dependencies required careful handling of indentation, multi-line signatures, and class context.
- Cross-platform compatibility: Windows Python environments have unique challenges (CP1252 encoding defaults, Microsoft Store aliases masquerading as python.exe) that required targeted fixes.
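The literal classification challenge above can be illustrated with a small heuristic classifier. The safe-value list, the regular expressions, and the order of the rules are all assumptions chosen for this sketch; GhostCode's real heuristics are not documented on this page.

```python
import re
from enum import Enum


class Verdict(Enum):
    SCRUB = "scrub"   # replace before sending
    KEEP = "keep"     # harmless, leave as-is
    FLAG = "flag"     # uncertain, ask the developer


# Hypothetical allowlist of values that reveal nothing about the domain.
SAFE_VALUES = {"", "utf-8", "ascii", "\n", "\t", " ", "r", "w", "rb", "wb"}

# Hypothetical patterns for URLs, internal hostnames, and resource-like names.
SENSITIVE = re.compile(
    r"https?://|[\w.-]+\.(?:internal|corp|local)\b|_(?:db|records|key|secret)\b",
    re.IGNORECASE,
)


def classify(literal: str) -> Verdict:
    """Decide what to do with one string literal."""
    if literal.lower() in SAFE_VALUES or len(literal) <= 2:
        return Verdict.KEEP
    if SENSITIVE.search(literal):
        return Verdict.SCRUB
    if re.search(r"[A-Za-z]{4,}", literal):
        # Contains a word-like run, so it may carry domain meaning.
        return Verdict.FLAG
    return Verdict.KEEP
```

The FLAG verdict exists because heuristics like these will misjudge some strings. Routing the uncertain cases to the developer avoids both silent leaks and over-scrubbing.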
What we learned
- AI tools are remarkably good at inferring intent from structure alone. Even with all names replaced, patterns like frequency-stepping arrays and power-budget conditionals can reveal the domain. True privacy requires scrubbing literals, isolating functions, and minimizing structural fingerprints.
- The gap between "anonymization" and "privacy" is wider than most developers realize. Find-and-replace is not enough — you need AST awareness, scope tracking, and literal classification.
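To show why scope tracking matters, the sketch below assigns one token per (scope, name) pair, so a `total` in one function never shares a token with a `total` in another. The `ScopedTokens` class and `collect` helper are hypothetical names. The sketch only handles function parameters and assigned names, and leaves out classes, closures, `global`/`nonlocal`, and imports.

```python
import ast


class ScopedTokens(ast.NodeVisitor):
    """Collect one ghost token per (scope path, name) for locally bound names."""

    def __init__(self):
        self.scope = ["<module>"]   # current chain of enclosing scopes
        self.tokens = {}            # (scope path, name) -> token

    def _bind(self, name):
        key = (".".join(self.scope), name)
        if key not in self.tokens:
            self.tokens[key] = f"gv_{len(self.tokens) + 1:03d}"

    def visit_FunctionDef(self, node):
        self.scope.append(node.name)        # enter the function's scope
        for arg in node.args.args:
            self._bind(arg.arg)
        self.generic_visit(node)
        self.scope.pop()                    # leave it again

    def visit_Name(self, node):
        # Only assignments (Store context) introduce a new binding.
        if isinstance(node.ctx, ast.Store):
            self._bind(node.id)


def collect(source):
    """Return the (scope path, name) -> token mapping for a source string."""
    visitor = ScopedTokens()
    visitor.visit(ast.parse(source))
    return visitor.tokens
```

A plain find-and-replace would give both `total` variables the same token and would also rewrite the word inside strings and comments. Keying on the scope path avoids the first problem, and working on the AST avoids the second.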
What's next for GhostCode
- More languages — JavaScript/TypeScript, Java, Go, Rust
- Native TypeScript parser — eliminate the Python dependency entirely
- Aggressive literal scrubbing — scrub dictionary keys, attribute names, and structural patterns that leak domain context
- Team sharing — secure cloud sync for ghost maps across team members
- CI/CD integration — auto-ghost before code reaches external AI APIs