AegisAI – Project Story
🎯 Inspiration
The inspiration for AegisAI came from a simple yet critical question:
When an LLM application breaks in production, how do you debug it
Unlike traditional software where you can trace exact code paths, LLMs are probabilistic black boxes.
A prompt injection attack could slip through undetected.
A subtle prompt change could spike API costs by ten times overnight.
A model hallucination could leak sensitive data.
And worst of all, you might not even know it happened.
While DevOps solved observability for traditional applications using logs, metrics, and traces, AI observability is still in the dark ages.
Companies are deploying billion dollar LLM applications with the same visibility as driving a car blindfolded.
That led to one idea:
What if LLMs had a black box flight recorder
Something that not only detects failures, but explains why they happened and how to fix them.
That idea became AegisAI.
💡 What It Does
AegisAI is a production grade AI security and observability platform that acts as a protective shield around LLM applications.
Think Datadog meets AI Security.
Core Features
🚨 Real Time Threat Detection
- Detects sixteen types of prompt injection attacks
- Covers system extraction, role injection, and bypass attempts
- Assigns severity levels with high confidence scoring
- Automatically creates Datadog incidents with full context
🧠 AI Powered Analysis
- Autopsy reports explaining failures in plain English
- Automatic prompt fix suggestions with side by side comparisons
- Executive summaries for non technical stakeholders
- One click replay to validate fixes
📊 Full Stack Observability
- Custom metrics for latency, token cost, risk level, and error rates
- Frontend browser logs with prompt and response visibility
- Automated monitors for security, latency, and cost anomalies
- Datadog incident management as a centralized command center
🎨 Modern User Interface
- Chat style interface similar to ChatGPT
- Dark mode with glassmorphism effects
- Real time incident notifications
- Smooth animations and responsive design
🛠️ How We Built It
Tech Stack
Frontend
- Next.js
- React
- Tailwind CSS
- Datadog Browser Logs SDK
Backend
- Node.js
- Express
- Google Vertex AI with Gemini Flash
- Datadog APIs for logs, metrics, and incidents
- hot shots StatsD client
- Application Default Credentials for secure authentication
🔍 Observability Setup
- Datadog Organization: AegisAI
- Datadog Site: us3.datadoghq.com
- Five custom metrics tracking requests, latency, tokens, errors, and risk
- Three automated monitors for security, latency, and cost spikes
- Automatic incident creation with forensic context
Data Flow
- User submits prompt
- Detection engine evaluates security risk
- Safe prompts go to Gemini
- Metrics stream to Datadog in real time
- Logs capture prompt and response pairs
- Malicious prompts create incidents automatically
- Response returns with metadata
🔑 Key Implementation Details
Detection Engine
- Sixteen detection patterns aligned with OWASP LLM risks
- Confidence scoring per pattern
- Severity classification based on impact
- Covers instruction override, data exfiltration, and role abuse
Metrics Pipeline
- StatsD fire and forget metrics
- No performance overhead
- Tagged metrics for filtering by severity and malicious state
- Latency and token usage as gauges
- Requests and errors as counters
Incident Automation
- Uses Datadog Incidents API
- Populates severity, impact, and prompt context
- Returns incident URL for immediate triage
Traffic Generator
- Four phase testing suite
- Normal traffic
- Malicious attacks
- Token spike scenarios
- Concurrent load tests
- Generates over fifty realistic requests
🏆 What Makes It Special
Most hackathon projects rely on mock data or fake responses.
AegisAI is real.
- Live Datadog metrics and incidents
- Real Gemini API calls via Vertex AI
- Real security detection logic
- Real observability signals
- Real traffic generator
- Polished production grade UI
You can clone the repository, run one command, and watch real incidents appear.
📚 What We Learned
Datadog as a Platform
- StatsD enables near zero latency metrics
- Metric tagging unlocks powerful filtering
- Incident APIs provide better context than chat alerts
- Anomaly detection is critical for cost control
LLM Observability is Different
- Tokens equal money
- Latency is probabilistic
- Security is contextual
- Debugging requires AI to explain AI
Application Default Credentials
- No API keys in code
- No secret rotation headaches
- Enterprise grade authentication
🚧 Challenges We Faced
- Datadog incident API field formatting
- Missing Datadog agent for StatsD
- False negatives in prompt detection
- Gemini rate limits during load tests
- Browser log configuration pitfalls
- UI polish for judge impact
Each challenge resulted in a stronger, more production ready system.
🎯 Accomplishments
- Fully working end to end system
- Enterprise grade security practices
- Comprehensive automated testing
- Production ready observability
- Polished professional interface
This is not a demo.
It is a blueprint for how AI systems should be operated.
🚀 What’s Next
Short Term
- Machine learning based detection
- Slack and PagerDuty integrations
- Rate limiting and quotas
- Multi model support
Long Term
- AegisAI Cloud SaaS
- Browser based AI red teaming
- Open source detection community
- Enterprise compliance tooling
🙏 Built For
Datadog and Google Cloud Hackathon 2025
Submitted by
Harmanpreet Singh
Organization
AegisAI
Datadog Site
us3.datadoghq.com
🔗 Links
- GitHub: https://github.com/sicaario/AegisAI
- Datadog Dashboard: https://us3.datadoghq.com/dashboard/lists
- Documentation: README.md and DATADOG_OBSERVABILITY.md
TLDR
AegisAI is a production ready LLM security and observability platform that detects, explains, and fixes AI failures in real time using Datadog and Gemini.
This is not a demo.
This is how AI should be monitored in production.
Log in or sign up for Devpost to join the conversation.