SentinelLLM brings deep observability, security, and cost visibility to LLM applications using Gemini and Datadog APM.

🚀 SentinelLLM — Project Story

🧠 Inspiration

Large Language Models are rapidly becoming core infrastructure in modern applications — from copilots to autonomous agents.
Yet, while model capabilities have evolved dramatically, observability has not.

Today, when an LLM misbehaves or slows down, when costs spike, or when a malicious prompt slips through, engineers are often blind. Traditional monitoring tools show CPU and memory usage, but nothing about prompts, tokens, or model behavior.

We built SentinelLLM because we believe:

If LLMs are production systems, they deserve production-grade observability.

🔍 What it does

SentinelLLM is a production-ready gateway that sits between users and an LLM (Gemini via Vertex AI) and provides deep, LLM-aware observability and security using Datadog APM.

It enables teams to:

Observe every LLM request end-to-end
Measure latency, tokens, and cost
Detect prompt injection and PII risks
Debug failures with full distributed traces
Turn LLM behavior from a black box into actionable signals

All without modifying application business logic.
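
To make the security step concrete, here is a minimal sketch of what a prompt check could look like. It is not the actual SentinelLLM implementation: the patterns, the `SecurityReport` class, and the `analyze_prompt` function are hypothetical names chosen for illustration, and a handful of regular expressions is only a rough stand-in for real injection and PII detection.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Hypothetical patterns for illustration only; a real deployment would need a
# much broader, regularly reviewed rule set.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"reveal (your|the) system prompt", re.I),
]
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


@dataclass
class SecurityReport:
    injection_suspected: bool = False
    pii_types: List[str] = field(default_factory=list)


def analyze_prompt(prompt: str) -> SecurityReport:
    """Flag likely prompt injection and list the PII categories found."""
    report = SecurityReport()
    report.injection_suspected = any(p.search(prompt) for p in INJECTION_PATTERNS)
    report.pii_types = sorted(
        name for name, pattern in PII_PATTERNS.items() if pattern.search(prompt)
    )
    return report
```

A gateway could run a check like this before inference and attach the result to the request trace, so that risky prompts become searchable signals rather than silent events.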

🛠️ How we built it

SentinelLLM is built as a FastAPI gateway with production-grade instrumentation.

Architecture Overview

Client
  ↓
SentinelLLM Gateway (FastAPI)
  ├─ Security analysis (prompt injection, PII)
  ├─ Gemini 2.0 inference (Vertex AI)
  ├─ Token & latency measurement
  └─ Datadog APM instrumentation
  ↓
Datadog Traces & Metrics

Key Technologies

Google Cloud Vertex AI — Gemini 2.0 models
Datadog APM — Host-based APM with real traces
FastAPI — High-performance Python backend
ddtrace-run — Automatic APM instrumentation
Docker — Local Datadog Agent for telemetry ingestion

Every /generate request is traced end-to-end, allowing engineers to inspect exactly how the model behaved for each prompt.
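
As a rough illustration of per-request tracing, the sketch below wraps a model call, measures latency, and collects span-style tags on both the success and the failure path. It uses only the standard library: the `emit` callback stands in for attaching tags to an APM span, and the function name and tag names (`llm.latency_ms` and so on) are assumptions, not the ones SentinelLLM uses.

```python
import time
from typing import Callable, Dict


def traced_generate(
    prompt: str,
    model_call: Callable[[str], str],
    emit: Callable[[Dict[str, object]], None],
) -> str:
    """Call the model, then hand a dict of span-style tags to `emit`."""
    tags: Dict[str, object] = {"llm.prompt.chars": len(prompt)}
    start = time.perf_counter()
    try:
        reply = model_call(prompt)
        tags["llm.status"] = "ok"
        tags["llm.response.chars"] = len(reply)
        return reply
    except Exception as exc:
        # Record the failure, then let the caller see the original error.
        tags["llm.status"] = "error"
        tags["llm.error.type"] = type(exc).__name__
        raise
    finally:
        # Runs on both paths, so latency is always reported.
        tags["llm.latency_ms"] = (time.perf_counter() - start) * 1000.0
        emit(tags)
```

Emitting in the `finally` block means a failed or timed-out model call still produces a record, which is usually the request an engineer most wants to inspect.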

⚔️ Challenges we ran into

1️⃣ LLMs don’t fit traditional monitoring

Standard metrics don’t capture:

Prompt complexity
Token consumption
Model-specific latency

We had to design LLM-native telemetry signals instead of generic infrastructure metrics.
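
One way such LLM-native signals could be modeled is sketched below. The `LLMUsage` class and `estimate_cost_usd` function are hypothetical, the token counts are assumed to come from the usage metadata the model returns with each response, and the per-1,000-token prices are placeholder parameters rather than real Gemini pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMUsage:
    """Token counts for one request, as reported by the model response."""
    prompt_tokens: int
    completion_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


def estimate_cost_usd(
    usage: LLMUsage,
    input_price_per_1k: float,
    output_price_per_1k: float,
) -> float:
    """Estimate request cost; prices are caller-supplied placeholders."""
    input_cost = (usage.prompt_tokens / 1000.0) * input_price_per_1k
    output_cost = (usage.completion_tokens / 1000.0) * output_price_per_1k
    return input_cost + output_cost
```

Keeping input and output prices separate matters because providers commonly charge different rates for prompt and completion tokens.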

2️⃣ Gemini 2.0 API changes

In our integration, Gemini 2.0 required structured content payloads, unlike the earlier versions we had used.
Plain string prompts failed silently instead of raising an error.

We refactored our inference layer to use structured content payloads, ensuring compatibility with the latest models.
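
For reference, a structured payload in the `contents` / `parts` layout used by Gemini request bodies can be built as in the sketch below. The helper name `build_contents` is hypothetical, and this is not necessarily how our inference layer constructs its requests; it only shows the shape.

```python
from typing import Any, Dict


def build_contents(prompt: str, role: str = "user") -> Dict[str, Any]:
    """Wrap a plain prompt in a structured single-turn request body."""
    if not prompt.strip():
        # Fail loudly rather than sending an empty turn to the model.
        raise ValueError("prompt must not be empty")
    return {"contents": [{"role": role, "parts": [{"text": prompt}]}]}
```

Validating the prompt here turns a potential silent failure into an explicit error at the gateway.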

3️⃣ Observability boundaries were non-obvious

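We initially gated telemetry inside the application on the presence of an API key. That is the wrong layer for OTLP-based and agent-based pipelines: the application only ships telemetry to the local agent, and the agent holds the API key and forwards data to Datadog.

We learned that the agent is the security boundary, not the app.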
Fixing this unlocked seamless Datadog trace ingestion.
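
The sketch below illustrates that boundary: the application resolves only the address of the local Agent and never touches an API key. The environment variable names `DD_TRACE_AGENT_URL`, `DD_AGENT_HOST`, and `DD_TRACE_AGENT_PORT` and the default port 8126 follow Datadog's tracer configuration; the function `resolve_agent_url` is a hypothetical helper, since ddtrace performs this resolution on its own.

```python
import os
from typing import Mapping, Optional


def resolve_agent_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the trace Agent URL the app should send to. No API key needed."""
    env = os.environ if env is None else env
    explicit = env.get("DD_TRACE_AGENT_URL")
    if explicit:
        return explicit
    host = env.get("DD_AGENT_HOST", "localhost")
    port = env.get("DD_TRACE_AGENT_PORT", "8126")
    return f"http://{host}:{port}"
```

Credentials stay in the Agent's own configuration, so the gateway can emit traces unconditionally and leave authentication to the Agent.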

🏆 Accomplishments that we're proud of

✅ Real Gemini 2.0 inference (no mocks)
✅ Live Datadog APM traces
✅ Host-based instrumentation
✅ End-to-end request visibility
✅ Production-grade failure handling
✅ Security-first design mindset

Most importantly, SentinelLLM is not a demo toy — it behaves like real infrastructure.

📚 What we learned

Observability for AI systems must be model-aware
LLM cost and latency are first-class production concerns
Datadog APM is powerful when used beyond basic metrics
AI systems require the same rigor as distributed microservices
Good observability changes how teams design systems

🔮 What's next for SentinelLLM

SentinelLLM is just the beginning.

Next steps include:

🔍 Advanced prompt anomaly detection
💰 Cost-based alerting and budgets
🔁 Multi-model support (Claude, GPT, open-source LLMs)
📊 Custom Datadog dashboards for AI teams
🧠 Automatic incident summaries for LLM failures

Our vision is to make LLM observability a default, not an afterthought.

🧭 Final Thought

You can’t secure what you can’t observe.
You can’t scale what you can’t understand.

SentinelLLM makes LLMs observable, secure, and production-ready.

Built With

curl
datadog-apm
datadog-apm-api
datadoghq-metrics-explorer
ddtrace
docker
fastapi-rest-api
git
github
google-cloud
host-based-instrumentation
iam-and-service-accounts
opentelemetry
pydantic
python
uvicorn
vertex-ai
vertex-ai-sdk
vscode

Updates

Tanush Jain started this project — Dec 31, 2025 10:22 AM EST
