Inspiration 💡

As AI applications become ubiquitous, we noticed a critical gap: LLMs are incredibly vulnerable to attacks, yet most security solutions are reactive rather than proactive. Traditional approaches like OpenAI's moderation API run after the LLM processes the prompt, meaning the attack has already happened. We asked ourselves: "What if we could stop attacks before they reach the AI?"

The rise of prompt injection attacks, jailbreaks, and data exfiltration attempts showed us that content moderation alone isn't enough. We needed something that understands both technical patterns (SQL injection syntax) and semantic intent (malicious sentiment). That's when we realized: Why not combine rule-based pattern matching with AI-powered context understanding?

What It Does 🛡️

DevShield AI is a production-ready LLM security monitoring system that protects AI applications from 8 categories of attacks:

  1. SQL Injection - Detects database manipulation attempts
  2. Code Injection - Blocks code execution exploits
  3. System Commands - Prevents OS-level attacks
  4. Prompt Injection - Stops AI behavior manipulation
  5. Data Exfiltration - Catches sensitive data extraction
  6. Malicious Instructions - Identifies harmful directives
  7. Jailbreak Attempts - Prevents safety bypass techniques
  8. Cost Attacks - Stops resource exhaustion

How it works:

  • Every user prompt is analyzed in real-time (150ms average)
  • Hybrid detection: 538 attack patterns (70% weight) + Vertex AI sentiment/entity analysis (30% weight)
  • Risk score calculated (0-100)
  • Prompts ≥76 are blocked immediately - never reaching the LLM
  • Safe prompts are forwarded to Gemini 2.0 for response generation
  • All events logged and monitored via Datadog APM

Key Innovation: We analyze before the LLM processes the prompt, preventing attacks rather than just detecting them.

How We Built It 🔨

Architecture:

User Request → Security Analysis (Patterns + Vertex AI) → Rate Limiting
                    ↓                                           ↓
            Risk Score Calculated                      Tier Assignment
                    ↓                                           ↓
              Decision: Risk ≥76?                               ↓
        ┌───────────┴───────────┐                              ↓
        ↓                       ↓                               ↓
    Block (25ms)            Allow                               ↓
        ↓                       ↓                               ↓
    Attack Log            Gemini Response ←────────────────────┘
    (immediate)               (~850ms)
        ↓                       ↓
    User sees blocked     Safe Response
        ↓                       ↓
    [Background]          User Sees Result
    AI Explanation
    (850ms later)
        ↓
    Admin Dashboard
    (Datadog Integration)

    ══════════════════════════════════════════════════
    ║ Datadog APM - Continuous Monitoring (All Steps) ║
    ══════════════════════════════════════════════════

Phase 1: Pattern Matching Engine (Week 1)

  • Researched and cataloged 538 attack patterns across 8 categories
  • Implemented regex-based pattern matching with confidence scoring
  • Optimized for <15ms analysis time using pattern caching
  • Created risk scoring algorithm with weighted categories (9-15 points each)

Phase 2: Vertex AI Integration (Week 1-2)

  • Integrated Google Cloud Natural Language API for sentiment analysis
  • Implemented entity detection to catch data exfiltration attempts
  • Built hybrid scoring system: 70% patterns + 30% AI context
  • Added educational context detection to reduce false positives

Phase 3: Rate Limiting & Attack Storage (Week 2)

  • Designed 3-tier adaptive rate limiting system (100/30/10 req/min)
  • Implemented in-memory attack storage with UUID generation
  • Added automatic tier assignment based on risk scores
  • Created user/IP blocking functionality

Phase 4: AI Explanation Engine (Week 2)

  • Integrated Gemini 2.0 Flash Experimental for attack analysis
  • Built async explanation pipeline (runs in background)
  • Designed structured prompts for consistent output
  • Added graceful fallback for API failures

Phase 5: Frontend & Admin Portal (Week 2-3)

  • Built dark-themed chat interface with real-time risk badges
  • Created admin dashboard with live metrics and charts
  • Implemented attack log with detailed modal views
  • Added user management (block/unblock functionality)
  • Designed responsive UI with Tailwind CSS

Phase 6: Observability & Production (Week 3)

  • Integrated Datadog APM with dd-trace instrumentation
  • Created 16 custom metrics (risk scores, attack counts, rate limits)
  • Set up traces for performance monitoring
  • Configured production deployment on Google Cloud Run
  • Implemented JWT authentication for admin portal

Phase 7: Testing & Optimization (Week 3-4)

  • Created comprehensive test suite: 1,001 test cases
  • Tested against real-world attack examples
  • Achieved 95% accuracy (951/1001 passed)
  • Verified 0% false positives on legitimate prompts
  • Optimized Docker build for fast deployments

Technologies Used:

Frontend:

  • Next.js 14 (App Router with React Server Components)
  • React 18 with TypeScript 5
  • Tailwind CSS 4 for styling
  • Server-Sent Events (SSE) for streaming

Backend:

  • Next.js API Routes (serverless functions)
  • Node.js 18 runtime
  • In-memory data structures (Maps for rate limiting & attack storage)

AI/ML:

  • Vertex AI Natural Language API - Sentiment analysis & entity detection
  • Gemini 2.0 Flash Experimental - Response generation & attack explanation
  • Google Generative AI SDK

Observability:

  • Datadog APM - Application performance monitoring
  • dd-trace - Node.js instrumentation
  • Custom metrics, traces, and logs

Cloud Infrastructure:

  • Google Cloud Run - Serverless container deployment
  • Google Container Registry - Docker image storage
  • Google Secret Manager - API key management
  • Docker (multi-stage builds with Alpine Linux)

Authentication & Security:

  • JWT (JSON Web Tokens) via jose library
  • HttpOnly cookies for session management
  • Rate limiting per IP/User

Challenges We Faced 🏔️

Challenge 1: Balancing Speed vs. Accuracy

  • Problem: Pattern matching is fast (12ms) but misses context. AI is smart but slow (850ms).
  • Solution: Hybrid approach with weighted scoring (70% patterns, 30% AI). Patterns provide precision, AI provides context.
  • Result: 150ms total analysis time with 95% accuracy.

Challenge 2: False Positives on Educational Queries

  • Problem: Initial system blocked "How do I prevent SQL injection in my code?" (educational)
  • Solution: Implemented educational context detection (keywords: "example", "learn", "prevent", "tutorial")
  • Algorithm: Reduce risk by 85% when educational intent detected
  • Result: 0% false positives on 121 legitimate test cases

Challenge 3: Real-Time Analysis at Scale

  • Problem: Can't wait 850ms for Gemini on every request
  • Solution:
    • For blocked prompts: Return response immediately (25ms), explain attack in background
    • For safe prompts: Run Vertex AI in parallel with pattern matching
    • For responses: Stream Gemini output chunk-by-chunk
  • Result: User sees blocking in 25ms, safe responses start in 150ms

Challenge 4: Attack Pattern Maintenance

  • Problem: Attack techniques evolve constantly
  • Solution:
    • Modular pattern system (easy to add new patterns)
    • Vertex AI catches semantic attacks that patterns miss
    • AI explanation helps admins identify new patterns
    • Currently 538 patterns, easily expandable
  • Result: System adapts to new threats without code changes

Challenge 5: Datadog Integration Complexity

  • Problem: Next.js 14 App Router has limited Datadog documentation
  • Solution:
    • Used dd-trace manual instrumentation
    • Created custom metrics with proper tagging
    • Wrapped API routes with trace spans
    • Added instrumentation hook in instrumentation.ts
  • Result: 16 custom metrics, full APM coverage, sub-millisecond overhead

Challenge 6: Docker Build Size & Speed

  • Problem: Initial Docker image was 1.2GB, took 8 minutes to build
  • Solution:
    • Multi-stage builds (deps → builder → runner)
    • Alpine Linux base image
    • Next.js standalone output mode
    • Layer caching optimization
  • Result: 350MB final image, 2-minute builds, 8-second cold starts

Challenge 7: Preventing LLM Response Leaks

  • Problem: Even with blocked prompts, need to ensure LLM doesn't accidentally leak data
  • Solution:
    • Built detectPIIInResponse() using Vertex AI entity detection
    • Can scan LLM responses for accidental PII exposure
    • Currently runs on-demand, could be made automatic
  • Status: Implemented but not in main flow (future enhancement)

Challenge 8: Rate Limiting Fairness

  • Problem: How to throttle attackers without impacting normal users?
  • Solution:
    • Three-tier system based on risk level:
    • Tier 1 (Low risk): 100 req/min
    • Tier 2 (Medium): 30 req/min
    • Tier 3 (High): 10 req/min
    • Users automatically move between tiers based on behavior
    • Attackers get 10x slower service, normal users unaffected
  • Result: Effective DoS prevention without impacting UX

What We Learned 🎓

Technical Learnings:

  1. Hybrid AI Systems Work Better - Combining rule-based patterns with AI context understanding gave us higher accuracy than either alone. Patterns catch technical exploits, AI catches semantic intent.

  2. Performance Matters in Security - Users won't tolerate slow security checks. We optimized every millisecond: pattern caching, parallel API calls, async processing.

  3. Observability is Critical - Datadog integration was essential for understanding system behavior. Without metrics, we were flying blind.

  4. Graceful Degradation - When Vertex AI fails, fall back to pattern-only. When Gemini fails, return generic explanation. System keeps working.

  5. Testing is Security - Our 1,001 test cases caught edge cases we never would have found manually. Automated testing is non-negotiable.

AI/ML Learnings:

  1. Vertex AI Natural Language is Powerful - Sentiment + entity detection caught attacks that pure pattern matching missed. The combination is greater than the sum.

  2. Gemini 2.0 Flash is Fast - At 850ms average, it's the fastest model we tested. Perfect for real-time explanations.

  3. Prompt Engineering Matters - Our attack explanation prompts went through 5 iterations. Structured output format was key to consistency.

  4. AI Confidence Scoring - We learned to calculate confidence from sentiment magnitude, not just score. High magnitude = high confidence.

Architecture Learnings:

  1. Serverless Scales Naturally - Cloud Run's auto-scaling handled load spikes without configuration. Stateless design was crucial.

  2. In-Memory > Database for Speed - Attack storage in memory (Maps) is 100x faster than database queries. Perfect for real-time systems.

  3. Docker Multi-Stage Builds - Reduced image size by 72% and deployment time by 60%. Production-critical.

Security Learnings:

  1. Defense in Depth - Multiple layers: patterns, AI, rate limiting, JWT auth, IP blocking. No single point of failure.

  2. Block Before Processing - Analyzing before the LLM sees the prompt is fundamentally more secure than after.

  3. Educational Context is Complex - Distinguishing "How do I hack?" (educational) from "How do I hack?" (malicious) required AI context understanding.

Accomplishments 🏆

95% Detection Accuracy - Tested with 1,001 real-world examples
0% False Positives - Never blocked a legitimate prompt
150ms Analysis Time - Fast enough for real-time applications
538 Attack Patterns - Comprehensive coverage across 8 categories
Hybrid AI System - First LLM security tool combining patterns + Vertex AI + Gemini
Production-Ready - Deployed on Cloud Run with Datadog monitoring
3-Tier Rate Limiting - Adaptive throttling based on threat level
Full Admin Portal - Real-time dashboard with AI-powered explanations
Zero Downtime Deployment - Auto-scaling, health checks, graceful degradation
Open Source - MIT License, ready for community contributions

What's Next 🚀

Near-Term Enhancements:

  1. Redis for Distributed Attack Storage - Share attack logs across Cloud Run instances
  2. PII Response Scanning - Automatically scan LLM responses for data leaks
  3. Custom Pattern Upload - Let admins add organization-specific attack patterns
  4. Webhook Notifications - Alert security teams via Slack/PagerDuty on critical attacks
  5. Historical Analytics - Long-term attack trend analysis and reporting

Future Vision:

  1. Multi-Model Support - Add support for Claude, GPT-4, Llama, etc.
  2. Plugin System - Allow third-party security rules and integrations
  3. Machine Learning Enhancement - Train custom models on attack patterns
  4. API Gateway Mode - Deploy as a reverse proxy for any LLM API
  5. Enterprise Features - Multi-tenant support, RBAC, SSO integration

Built With

Share this project:

Updates