DevShield AI

Datadog Dashboard
Attack Details
Initial Chat Page
Blocked Users
Chat Page with Prompts
Admin Portal Login
Admin Dashboard
All Attacks

Inspiration 💡

As AI applications become ubiquitous, we noticed a critical gap: LLMs are incredibly vulnerable to attacks, yet most security solutions are reactive rather than proactive. Traditional approaches like OpenAI's moderation API run after the LLM processes the prompt, meaning the attack has already happened. We asked ourselves: "What if we could stop attacks before they reach the AI?"

The rise of prompt injection attacks, jailbreaks, and data exfiltration attempts showed us that content moderation alone isn't enough. We needed something that understands both technical patterns (SQL injection syntax) and semantic intent (malicious sentiment). That's when we realized: Why not combine rule-based pattern matching with AI-powered context understanding?

What It Does 🛡️

DevShield AI is a production-ready LLM security monitoring system that protects AI applications from 8 categories of attacks:

SQL Injection - Detects database manipulation attempts
Code Injection - Blocks code execution exploits
System Commands - Prevents OS-level attacks
Prompt Injection - Stops AI behavior manipulation
Data Exfiltration - Catches sensitive data extraction
Malicious Instructions - Identifies harmful directives
Jailbreak Attempts - Prevents safety bypass techniques
Cost Attacks - Stops resource exhaustion

How it works:

Every user prompt is analyzed in real-time (150ms average)
Hybrid detection: 538 attack patterns (70% weight) + Vertex AI sentiment/entity analysis (30% weight)
Risk score calculated (0-100)
Prompts ≥76 are blocked immediately - never reaching the LLM
Safe prompts are forwarded to Gemini 2.0 for response generation
All events logged and monitored via Datadog APM

Key Innovation: We analyze before the LLM processes the prompt, preventing attacks rather than just detecting them.

How We Built It 🔨

Architecture:

User Request → Security Analysis (Patterns + Vertex AI) → Rate Limiting
                    ↓                                           ↓
            Risk Score Calculated                      Tier Assignment
                    ↓                                           ↓
              Decision: Risk ≥76?                               ↓
        ┌───────────┴───────────┐                              ↓
        ↓                       ↓                               ↓
    Block (25ms)            Allow                               ↓
        ↓                       ↓                               ↓
    Attack Log            Gemini Response ←────────────────────┘
    (immediate)               (~850ms)
        ↓                       ↓
    User sees blocked     Safe Response
        ↓                       ↓
    [Background]          User Sees Result
    AI Explanation
    (850ms later)
        ↓
    Admin Dashboard
    (Datadog Integration)

    ══════════════════════════════════════════════════
    ║ Datadog APM - Continuous Monitoring (All Steps) ║
    ══════════════════════════════════════════════════

Phase 1: Pattern Matching Engine (Week 1)

Researched and cataloged 538 attack patterns across 8 categories
Implemented regex-based pattern matching with confidence scoring
Optimized for <15ms analysis time using pattern caching
Created risk scoring algorithm with weighted categories (9-15 points each)

Phase 2: Vertex AI Integration (Week 1-2)

Integrated Google Cloud Natural Language API for sentiment analysis
Implemented entity detection to catch data exfiltration attempts
Built hybrid scoring system: 70% patterns + 30% AI context
Added educational context detection to reduce false positives

Phase 3: Rate Limiting & Attack Storage (Week 2)

Designed 3-tier adaptive rate limiting system (100/30/10 req/min)
Implemented in-memory attack storage with UUID generation
Added automatic tier assignment based on risk scores
Created user/IP blocking functionality

Phase 4: AI Explanation Engine (Week 2)

Integrated Gemini 2.0 Flash Experimental for attack analysis
Built async explanation pipeline (runs in background)
Designed structured prompts for consistent output
Added graceful fallback for API failures

Phase 5: Frontend & Admin Portal (Week 2-3)

Built dark-themed chat interface with real-time risk badges
Created admin dashboard with live metrics and charts
Implemented attack log with detailed modal views
Added user management (block/unblock functionality)
Designed responsive UI with Tailwind CSS

Phase 6: Observability & Production (Week 3)

Integrated Datadog APM with dd-trace instrumentation
Created 16 custom metrics (risk scores, attack counts, rate limits)
Set up traces for performance monitoring
Configured production deployment on Google Cloud Run
Implemented JWT authentication for admin portal

Phase 7: Testing & Optimization (Week 3-4)

Created comprehensive test suite: 1,001 test cases
Tested against real-world attack examples
Achieved 95% accuracy (951/1001 passed)
Verified 0% false positives on legitimate prompts
Optimized Docker build for fast deployments

Technologies Used:

Frontend:

Next.js 14 (App Router with React Server Components)
React 18 with TypeScript 5
Tailwind CSS 4 for styling
Server-Sent Events (SSE) for streaming

Backend:

Next.js API Routes (serverless functions)
Node.js 18 runtime
In-memory data structures (Maps for rate limiting & attack storage)

AI/ML:

Vertex AI Natural Language API - Sentiment analysis & entity detection
Gemini 2.0 Flash Experimental - Response generation & attack explanation
Google Generative AI SDK

Observability:

Datadog APM - Application performance monitoring
dd-trace - Node.js instrumentation
Custom metrics, traces, and logs

Cloud Infrastructure:

Google Cloud Run - Serverless container deployment
Google Container Registry - Docker image storage
Google Secret Manager - API key management
Docker (multi-stage builds with Alpine Linux)

Authentication & Security:

JWT (JSON Web Tokens) via jose library
HttpOnly cookies for session management
Rate limiting per IP/User

Challenges We Faced 🏔️

Challenge 1: Balancing Speed vs. Accuracy

Problem: Pattern matching is fast (12ms) but misses context. AI is smart but slow (850ms).
Solution: Hybrid approach with weighted scoring (70% patterns, 30% AI). Patterns provide precision, AI provides context.
Result: 150ms total analysis time with 95% accuracy.

Challenge 2: False Positives on Educational Queries

Problem: Initial system blocked "How do I prevent SQL injection in my code?" (educational)
Solution: Implemented educational context detection (keywords: "example", "learn", "prevent", "tutorial")
Algorithm: Reduce risk by 85% when educational intent detected
Result: 0% false positives on 121 legitimate test cases

Challenge 3: Real-Time Analysis at Scale

Problem: Can't wait 850ms for Gemini on every request
Solution:
- For blocked prompts: Return response immediately (25ms), explain attack in background
- For safe prompts: Run Vertex AI in parallel with pattern matching
- For responses: Stream Gemini output chunk-by-chunk
Result: User sees blocking in 25ms, safe responses start in 150ms

Challenge 4: Attack Pattern Maintenance

Problem: Attack techniques evolve constantly
Solution:
- Modular pattern system (easy to add new patterns)
- Vertex AI catches semantic attacks that patterns miss
- AI explanation helps admins identify new patterns
- Currently 538 patterns, easily expandable
Result: System adapts to new threats without code changes

Challenge 5: Datadog Integration Complexity

Problem: Next.js 14 App Router has limited Datadog documentation
Solution:
- Used dd-trace manual instrumentation
- Created custom metrics with proper tagging
- Wrapped API routes with trace spans
- Added instrumentation hook in instrumentation.ts
Result: 16 custom metrics, full APM coverage, sub-millisecond overhead

Challenge 6: Docker Build Size & Speed

Problem: Initial Docker image was 1.2GB, took 8 minutes to build
Solution:
- Multi-stage builds (deps → builder → runner)
- Alpine Linux base image
- Next.js standalone output mode
- Layer caching optimization
Result: 350MB final image, 2-minute builds, 8-second cold starts

Challenge 7: Preventing LLM Response Leaks

Problem: Even with blocked prompts, need to ensure LLM doesn't accidentally leak data
Solution:
- Built detectPIIInResponse() using Vertex AI entity detection
- Can scan LLM responses for accidental PII exposure
- Currently runs on-demand, could be made automatic
Status: Implemented but not in main flow (future enhancement)

Challenge 8: Rate Limiting Fairness

Problem: How to throttle attackers without impacting normal users?
Solution:
- Three-tier system based on risk level:
- Tier 1 (Low risk): 100 req/min
- Tier 2 (Medium): 30 req/min
- Tier 3 (High): 10 req/min
- Users automatically move between tiers based on behavior
- Attackers get 10x slower service, normal users unaffected
Result: Effective DoS prevention without impacting UX

What We Learned 🎓

Technical Learnings:

Hybrid AI Systems Work Better - Combining rule-based patterns with AI context understanding gave us higher accuracy than either alone. Patterns catch technical exploits, AI catches semantic intent.
Performance Matters in Security - Users won't tolerate slow security checks. We optimized every millisecond: pattern caching, parallel API calls, async processing.
Observability is Critical - Datadog integration was essential for understanding system behavior. Without metrics, we were flying blind.
Graceful Degradation - When Vertex AI fails, fall back to pattern-only. When Gemini fails, return generic explanation. System keeps working.
Testing is Security - Our 1,001 test cases caught edge cases we never would have found manually. Automated testing is non-negotiable.

AI/ML Learnings:

Vertex AI Natural Language is Powerful - Sentiment + entity detection caught attacks that pure pattern matching missed. The combination is greater than the sum.
Gemini 2.0 Flash is Fast - At 850ms average, it's the fastest model we tested. Perfect for real-time explanations.
Prompt Engineering Matters - Our attack explanation prompts went through 5 iterations. Structured output format was key to consistency.
AI Confidence Scoring - We learned to calculate confidence from sentiment magnitude, not just score. High magnitude = high confidence.

Architecture Learnings:

Serverless Scales Naturally - Cloud Run's auto-scaling handled load spikes without configuration. Stateless design was crucial.
In-Memory > Database for Speed - Attack storage in memory (Maps) is 100x faster than database queries. Perfect for real-time systems.
Docker Multi-Stage Builds - Reduced image size by 72% and deployment time by 60%. Production-critical.

Security Learnings:

Defense in Depth - Multiple layers: patterns, AI, rate limiting, JWT auth, IP blocking. No single point of failure.
Block Before Processing - Analyzing before the LLM sees the prompt is fundamentally more secure than after.
Educational Context is Complex - Distinguishing "How do I hack?" (educational) from "How do I hack?" (malicious) required AI context understanding.

Accomplishments 🏆

✅ 95% Detection Accuracy - Tested with 1,001 real-world examples
✅ 0% False Positives - Never blocked a legitimate prompt
✅ 150ms Analysis Time - Fast enough for real-time applications
✅ 538 Attack Patterns - Comprehensive coverage across 8 categories
✅ Hybrid AI System - First LLM security tool combining patterns + Vertex AI + Gemini
✅ Production-Ready - Deployed on Cloud Run with Datadog monitoring
✅ 3-Tier Rate Limiting - Adaptive throttling based on threat level
✅ Full Admin Portal - Real-time dashboard with AI-powered explanations
✅ Zero Downtime Deployment - Auto-scaling, health checks, graceful degradation
✅ Open Source - MIT License, ready for community contributions

What's Next 🚀

Near-Term Enhancements:

Redis for Distributed Attack Storage - Share attack logs across Cloud Run instances
PII Response Scanning - Automatically scan LLM responses for data leaks
Custom Pattern Upload - Let admins add organization-specific attack patterns
Webhook Notifications - Alert security teams via Slack/PagerDuty on critical attacks
Historical Analytics - Long-term attack trend analysis and reporting

Future Vision:

Multi-Model Support - Add support for Claude, GPT-4, Llama, etc.
Plugin System - Allow third-party security rules and integrations
Machine Learning Enhancement - Train custom models on attack patterns
API Gateway Mode - Deploy as a reverse proxy for any LLM API
Enterprise Features - Multi-tenant support, RBAC, SSO integration

Built With

datadog
git/github
google-cloud
jwt
next.js
typescript
vertex-ai

Updates

Mohibkhan Pathan started this project — Dec 28, 2025 07:48 AM EST

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.