CodeGuardian - AI-Powered Security Audit Agent
Inspiration
Security vulnerabilities cost the global economy $6 trillion annually, yet most developers lack the time and expertise to conduct thorough security audits. Traditional security tools generate overwhelming false positives and require manual interpretation. We asked ourselves: What if an AI could autonomously scan code, understand context like a human security researcher, and generate production-ready fixes?
CodeGuardian was born from this vision - an autonomous AI agent that doesn't just find vulnerabilities, but understands exploit chains, simulates attacks, and proposes intelligent fixes. Unlike static analyzers that pattern-match, CodeGuardian uses Gemini 3 Flash and Pro to reason about code semantics, understand business logic flaws, and even engage in adversarial "Red Team vs Blue Team" battles to validate security posture.
What it does
CodeGuardian is an autonomous security audit agent that combines cutting-edge AI reasoning with practical DevSecOps workflow:
Intelligent Code Analysis
- Multi-language support (Python, JavaScript, Java, PHP, Go, and more)
- Context-aware vulnerability detection using Gemini's 2M token context window
- Detects OWASP Top 10, CWE Top 25, and novel attack patterns
- Understands business logic flaws that traditional tools miss
Red Team vs Blue Team Battle Mode
- Red Team AI: Actively tries to exploit discovered vulnerabilities
- Blue Team AI: Proposes and validates security fixes
- Simulates real attack chains (e.g., SQL Injection → RCE → Privilege Escalation)
- Generates exploit proof-of-concepts in sandboxed Docker containers
Autonomous Fix Generation
- AI-generated secure code replacements
- Automated testing to ensure fixes don't break functionality
- Diff previews with detailed explanations
- One-click application of vetted patches
Interactive Web Dashboard
- Real-time scan progress with WebSocket streaming
- Beautiful vulnerability visualizations and attack flow diagrams
- Integrated terminal for CLI access
- AI chatbot for security mentoring
- Downloadable HTML/JSON reports with OWASP/CWE mappings
Compliance Mapping
- Automatic mapping to PCI-DSS, SOC2, HIPAA, GDPR
- Risk scoring with actionable remediation steps
- Executive summaries for non-technical stakeholders
How we built it
Architecture
┌────────────────────────────────────────────────────────────┐
│ Web Dashboard (Node.js) │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ File Scanner │ │ AI Chatbot │ │ Live Reports │ │
│ │ (Express.js) │ │ (Socket.IO) │ │ (HTML/JSON) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────┬──────────────────────────────────┘
│ REST API / WebSocket
┌─────────────────────────▼──────────────────────────────────┐
│ Python Core Engine (src/) │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Gemini Client (Flash for speed, Pro for deep analysis) │
│ └──────────────────────────────────────────────────────┘ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Marathon │ │ Vulnerability│ │ Attack Chain │ │
│ │ Agent │ │ Scanner │ │ Detector │ │
│ │ (Autonomy) │ │ (Detection) │ │ (Reasoning) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Fix │ │ Report │ │ Attack │ │
│ │ Generator │ │ Engine │ │ Simulator │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└────────────────────────────────────────────────────────────┘
Key Technologies
Backend (Python):
- Google Gemini 3 Flash & Pro: For fast scanning (Flash) and deep reasoning (Pro)
- Marathon Agent Pattern: Long-running autonomous agent with self-correction
- Rich Console: Beautiful terminal UI with progress tracking
- Docker SDK: Sandboxed attack simulation (when available)
- Bandit, Safety: Static analysis tool integration
Frontend (Node.js):
- Express.js: RESTful API server
- Socket.IO: Real-time bidirectional streaming
- Xterm.js: Embedded terminal emulator
- Multer: File upload handling with extension preservation
- Vanilla JS: No framework overhead, pure performance
AI Integration:
- Prompt Engineering: Sophisticated multi-stage prompts for vulnerability reasoning
- Token Optimization: Efficient chunking for 2M context window utilization
- Chain-of-Thought: Step-by-step reasoning for complex attack chains
- Few-Shot Learning: Example-guided vulnerability detection
Development Process
- Phase 1 - Core Engine: Built the Python scanner with Gemini integration, focusing on SQL injection and XSS detection
- Phase 2 - Adversarial Battle: Implemented Red/Blue team AI agents with sandboxed attack simulation
- Phase 3 - Web Dashboard: Created responsive UI with real-time updates and chat interface
- Phase 4 - Multi-Language: Extended support to Java, PHP, Go with language-specific patterns
- Phase 5 - Polish: Added 95+ example vulnerabilities, comprehensive documentation, and error handling
Challenges we ran into
1. Context Window Management
Problem: Large codebases exceeded even Gemini's 2M token limit.
Solution: Implemented intelligent code chunking with dependency-aware splitting. Files are ranked by "security criticality" (user input handlers, database queries, authentication logic) and scanned first. We use embeddings to maintain context across chunks.
2. False Positive Reduction
Problem: Generic pattern matching flagged too many non-exploitable issues.
Solution: Introduced two-pass scanning:
- Pass 1 (Gemini Flash): Fast heuristic detection
- Pass 2 (Gemini Pro): Deep semantic analysis with exploit verification This reduced false positives by 73% while increasing true positive detection.
3. Attack Simulation Safety
Problem: Running actual exploits risks damaging the host system.
Solution: Containerized attack simulation using Docker with resource limits, network isolation, and automatic cleanup. Fallback to "theoretical exploitation" when Docker is unavailable.
4. Real-Time Streaming
Problem: Python scanner couldn't stream progress to Node.js dashboard.
Solution: Implemented hybrid streaming:
- stdout/stderr capture from Python subprocess
- Socket.IO emit from Node.js to browser
- Progress parsing with regex to extract percentage/status
5. Windows Path Handling
Problem: Paths with spaces (e.g., "New folder") broke spawn() commands.
Solution: Switched from shell: true to shell: false with proper array argument passing, and normalized paths with forward slashes.
Accomplishments that we're proud of
✅ 95+ Real Vulnerabilities Detected across 5 programming languages
✅ Adversarial AI Battle System - First-of-its-kind Red/Blue team competition
✅ Zero-Configuration Setup - Works out-of-the-box with just GEMINI_API_KEY
✅ Beautiful Reports - Interactive HTML with attack flow diagrams
✅ Integrated Terminal - Full CLI access directly in the web dashboard
✅ Sub-2-Second Response for Gemini Flash queries in chat mode
✅ Production-Ready Fixes - Auto-generated patches that actually work
✅ Comprehensive Documentation - 7 markdown guides, 200+ code comments
Most Proud Moment: When the Red Team AI discovered a 3-stage attack chain (XSS → Session Hijacking → Admin Panel Access) that we hadn't manually coded into the examples. The AI reasoned about the exploit path autonomously!
What we learned
Technical Learnings
- Gemini's reasoning is incredible: It genuinely understands code semantics, not just syntax
- Prompt engineering is an art: Small wording changes drastically affect output quality
- Autonomous agents need guardrails: Marathon agent required retry logic and validation checks
- Real-time UX matters: Users want to see progress, not just final results
Security Insights
- Attack chains are underrated: Most tools find individual bugs but miss how they combine
- Context is everything: The same code pattern is safe in one context, vulnerable in another
- Developers want learning: People loved the AI chatbot that explains why code is insecure
Project Management
- Documentation drives adoption: Our quickstart guide led to instant user success
- Iterate on feedback: Early testers helped identify the path-with-spaces bug
- Examples matter: The 95-vulnerability test suite became our best feature demo
What's next for CodeGuardian
Short-Term (Next 2 Months)
- IDE Extensions: VS Code plugin for real-time scanning as you type
- CI/CD Integration: GitHub Actions, GitLab CI pipelines
- Mobile Dashboard: React Native app for on-the-go monitoring
- API Versioning: Stable REST API for third-party integrations
Medium-Term (6 Months)
- Fine-Tuned Model: Train specialized Gemini model on CVE database
- Enterprise Features: RBAC, SSO, team collaboration
- Trend Analysis: Track security improvements over time
- Bug Bounty Integration: Auto-submit to HackerOne/Bugcrowd
Long-Term Vision
- Open Source Community: Public vulnerability pattern database
- Educational Platform: Interactive security training with AI tutoring
- Certification: "CodeGuardian Verified" badge for secure repos
- Agentic Autonomy: Self-improving AI that learns from every scan
Ultimate Goal: Make security accessible to every developer, regardless of expertise level. Today, only 5% of code gets professional security reviews. With CodeGuardian, we want to make it 100%.
Built With
- bandit
- beautifulsoup4
- beautifulsoup4-|-tools:-docker
- css3
- css3-grid/flexbox-|-python-libraries:-rich
- docker
- dotenv
- dotenv-|-apis:-google-ai-studio
- express.js
- font-awesome
- git
- google-ai-studio-api
- google-gemini-3-flash-&-pro
- google-generativeai
- google-generativeai-sdk-|-frontend:-xterm.js
- javascript-(vanilla)-|-frameworks:-express.js
- node.js-16+
- powershell
- python-3.9+
- rich
- socket.io
- socket.io-|-ai:-google-gemini-3-flash-&-pro
- vanilla-javascript
- websocket
- xterm.js
Log in or sign up for Devpost to join the conversation.