AegisAI – Project Story

🎯 Inspiration

The inspiration for AegisAI came from a simple yet critical question:

When an LLM application breaks in production, how do you debug it

Unlike traditional software where you can trace exact code paths, LLMs are probabilistic black boxes.
A prompt injection attack could slip through undetected.
A subtle prompt change could spike API costs by ten times overnight.
A model hallucination could leak sensitive data.
And worst of all, you might not even know it happened.

While DevOps solved observability for traditional applications using logs, metrics, and traces, AI observability is still in the dark ages.
Companies are deploying billion dollar LLM applications with the same visibility as driving a car blindfolded.

That led to one idea:

What if LLMs had a black box flight recorder

Something that not only detects failures, but explains why they happened and how to fix them.

That idea became AegisAI.

💡 What It Does

AegisAI is a production grade AI security and observability platform that acts as a protective shield around LLM applications.

Think Datadog meets AI Security.

Core Features

🚨 Real Time Threat Detection

Detects sixteen types of prompt injection attacks
Covers system extraction, role injection, and bypass attempts
Assigns severity levels with high confidence scoring
Automatically creates Datadog incidents with full context

🧠 AI Powered Analysis

Autopsy reports explaining failures in plain English
Automatic prompt fix suggestions with side by side comparisons
Executive summaries for non technical stakeholders
One click replay to validate fixes

📊 Full Stack Observability

Custom metrics for latency, token cost, risk level, and error rates
Frontend browser logs with prompt and response visibility
Automated monitors for security, latency, and cost anomalies
Datadog incident management as a centralized command center

🎨 Modern User Interface

Chat style interface similar to ChatGPT
Dark mode with glassmorphism effects
Real time incident notifications
Smooth animations and responsive design

🛠️ How We Built It

Tech Stack

Frontend

Next.js
React
Tailwind CSS
Datadog Browser Logs SDK

Backend

Node.js
Express
Google Vertex AI with Gemini Flash
Datadog APIs for logs, metrics, and incidents
hot shots StatsD client
Application Default Credentials for secure authentication

🔍 Observability Setup

Datadog Organization: AegisAI
Datadog Site: us3.datadoghq.com
Five custom metrics tracking requests, latency, tokens, errors, and risk
Three automated monitors for security, latency, and cost spikes
Automatic incident creation with forensic context

Data Flow

User submits prompt
Detection engine evaluates security risk
Safe prompts go to Gemini
Metrics stream to Datadog in real time
Logs capture prompt and response pairs
Malicious prompts create incidents automatically
Response returns with metadata

🔑 Key Implementation Details

Detection Engine

Sixteen detection patterns aligned with OWASP LLM risks
Confidence scoring per pattern
Severity classification based on impact
Covers instruction override, data exfiltration, and role abuse

Metrics Pipeline

StatsD fire and forget metrics
No performance overhead
Tagged metrics for filtering by severity and malicious state
Latency and token usage as gauges
Requests and errors as counters

Incident Automation

Uses Datadog Incidents API
Populates severity, impact, and prompt context
Returns incident URL for immediate triage

Traffic Generator

Four phase testing suite
Normal traffic
Malicious attacks
Token spike scenarios
Concurrent load tests
Generates over fifty realistic requests

🏆 What Makes It Special

Most hackathon projects rely on mock data or fake responses.

AegisAI is real.

Live Datadog metrics and incidents
Real Gemini API calls via Vertex AI
Real security detection logic
Real observability signals
Real traffic generator
Polished production grade UI

You can clone the repository, run one command, and watch real incidents appear.

📚 What We Learned

Datadog as a Platform

StatsD enables near zero latency metrics
Metric tagging unlocks powerful filtering
Incident APIs provide better context than chat alerts
Anomaly detection is critical for cost control

LLM Observability is Different

Tokens equal money
Latency is probabilistic
Security is contextual
Debugging requires AI to explain AI

Application Default Credentials

No API keys in code
No secret rotation headaches
Enterprise grade authentication

🚧 Challenges We Faced

Datadog incident API field formatting
Missing Datadog agent for StatsD
False negatives in prompt detection
Gemini rate limits during load tests
Browser log configuration pitfalls
UI polish for judge impact

Each challenge resulted in a stronger, more production ready system.

🎯 Accomplishments

Fully working end to end system
Enterprise grade security practices
Comprehensive automated testing
Production ready observability
Polished professional interface

This is not a demo.

It is a blueprint for how AI systems should be operated.

🚀 What’s Next

Short Term

Machine learning based detection
Slack and PagerDuty integrations
Rate limiting and quotas
Multi model support

Long Term

AegisAI Cloud SaaS
Browser based AI red teaming
Open source detection community
Enterprise compliance tooling

🙏 Built For

Datadog and Google Cloud Hackathon 2025

Submitted by
Harmanpreet Singh

Organization
AegisAI

Datadog Site
us3.datadoghq.com

🔗 Links

GitHub: https://github.com/sicaario/AegisAI
Datadog Dashboard: https://us3.datadoghq.com/dashboard/lists
Documentation: README.md and DATADOG_OBSERVABILITY.md

TLDR

AegisAI is a production ready LLM security and observability platform that detects, explains, and fixes AI failures in real time using Datadog and Gemini.

This is not a demo.
This is how AI should be monitored in production.

Built With

Updates

Private user started this project — Dec 25, 2025 12:02 AM EST

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.