Inspiration
We saw a gap. Everyone is rushing to deploy GenAI, but few are thinking about the "Day 2" problems: What if my bot gets tricked into revealing secrets? What if a loop drains my bank account? What if it hallucinates medical advice?
Traditional firewalls protect servers; they don't understand language. We wanted to build a firewall that speaks "LLM"—one that sits between the user and the model, understanding the intent of a prompt and the safety of a response before it ever reaches the end user. Clestiq Shield was born from the desire to make GenAI safe for production, not just cool for demos.
What it does
Clestiq Shield is a comprehensive AI Firewall and Observability Platform. It acts as a middleware proxy that intercepts every request and response to your LLM.
- Blocks Attacks: It detects prompt injections, jailbreaks, and malicious inputs in real-time.
- Controls Costs: It acts as a "circuit breaker" for your budget, preventing token spikes and DDoS attacks.
- Ensures Quality: It inspects model outputs for hallucinations, toxicity, and unauthorized tone.
- Observability First: It doesn't just block; it explains why. We integrated deeply with Datadog to visualize every token, cost, and threat.
How we built it
We built a microservices architecture to keep things fast and modular:
- The Core: Built with Python and FastAPI for high-throughput async processing.
- The Brains: We use LangGraph to orchestrate our security agents. We have specialized agents: Sentinel (pattern matching & rapid heuristics) and Guardian (a parallel LLM that "grades" the primary model's output).
- The Data Layer: PostgreSQL for structured data and Redis for lightning-fast rate limiting and caching.
- The Eyes: We went all-in on Datadog. We use custom metrics, APM tracing, and log management to build a "Mission Control" dashboard that tracks business health (API usage), security signals (attacks blocked), and LLM economics (cost per user).
Challenges we ran into
- The Latency vs. Security Trade-off: Running a second LLM to check the first one adds time. We had to optimize heavily, using parallel processing and smarter, smaller models for the "Guardian" checks to keep the user experience snappy.
- Defining "Bad": It's easy to block a SQL injection code snippet. It's much harder to block a user persuading the AI to be mean. Tuning our detection rules to minimize false positives was a constant iteration.
- Observability Noise: Initially, we logged everything. It was a mess. We had to learn how to structure structured logs and span tags so that a Datadog dashboard could instantly tell the difference between a "technical error" and a "security incident."
Accomplishments that we're proud of
- Real-time Intervention: Watching the system live-block a jailbreak attempt while passing legitimate users is incredibly satisfying.
- The Dashboard: Our Datadog dashboard isn't just pretty charts; it's an actionable command center. We can see a spike in "P1 Critical" alerts and drill down to the specific user and prompt in seconds.
- Seamless Integration: We built it as a drop-in proxy. You change your base URL, and suddenly your existing app handles security and comprehensive logging without rewriting your client code.
What we learned
- Security is a User Experience: If security is too aggressive, the app feels broken. If it's too lax, it's useless. The "sweet spot" requires constant tuning.
- LLMs need specialized monitoring: CPU usage doesn't tell you if your model is hallucinating. You need "semantic monitoring"—tracking topics, sentiment, and refusal rates.
What's next for Clestiq Shield
- Custom Policy Engine: allowing non-technical users to define rules like "No talk about competitors" in plain English.
- Multi-Model Support: Expanding beyond our current providers to support any OpenAI-compatible endpoint.
- Automated Red Teaming: An agent that constantly tries to hack your own app to find weaknesses before attackers do.
Built With
- datadog
- docker
- fastapi
- gcp
- gemini
- kubernetes
- langchain
- langgraph
- next.js
- postgresql
- redis
- tailwind
Log in or sign up for Devpost to join the conversation.