Inspiration
As AI adoption accelerates, organizations are exposing themselves to entirely new categories of cyber threats. We realized that traditional security measures aren't enough to protect Large Language Models. We were inspired to build a solution that assumes a "zero-trust" environment, one where LLMs can be tested safely and rigorously against sophisticated attacks such as prompt injection and data exfiltration before they ever reach production.
What it does
AstroVault is an Enterprise AI Safety Validation platform. Think of it as a firewall for LLMs: it sits between the user and the AI and intercepts every prompt. A threat engine analyzes each request in real time to detect jailbreaks, prompt injections, and other malicious intent. Depending on the threat score, the request is automatically allowed, flagged, blocked, or quarantined. Every action is recorded in a tamper-evident cryptographic ledger, so the full history can be audited.
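To make the score-to-action step concrete, here is a minimal Python sketch of how a threat score might map to the four outcomes. The 0-100 scale and the cut-off values are assumptions made for illustration; this write-up does not state the real thresholds, and the project's own services are not written in Python.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    FLAG = "flag"
    BLOCK = "block"
    QUARANTINE = "quarantine"


# Hypothetical cut-offs on an assumed 0-100 threat score.
FLAG_AT = 30
BLOCK_AT = 60
QUARANTINE_AT = 85


def route(threat_score: int) -> Action:
    """Map a threat score to one of the four actions, highest band first."""
    if not 0 <= threat_score <= 100:
        raise ValueError(f"threat score out of range: {threat_score}")
    if threat_score >= QUARANTINE_AT:
        return Action.QUARANTINE
    if threat_score >= BLOCK_AT:
        return Action.BLOCK
    if threat_score >= FLAG_AT:
        return Action.FLAG
    return Action.ALLOW
```

Checking the highest band first keeps the function a simple cascade, and rejecting out-of-range scores means a malformed verdict cannot silently fall through to "allow".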
How we built it
We designed a microservice architecture built for maximum isolation.
Frontend & Auth: Next.js and Clerk (SSO).
Isolation: Docker's internal bridge networking keeps our adversarial "Red Team" containers on a network with no route to the "Blue Team" threat classifier.
Threat Engine: We use Google's Gemini 2.5 Flash API, strictly prompted to act as a jailbreak-resistant security analyst that returns threat scores in a fixed JSON schema.
Reliability: We engineered a fallback "Heuristic Regex Engine" so that prompts are still screened if the network drops and the Gemini API is unreachable.
Auditability: A custom Node.js service chains cryptographically hashed logs together, much like a blockchain, so that any alteration of historical data is detectable.
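The hash chaining behind the audit service can be sketched as follows. This is an illustrative Python version, not the project's Node.js code; the entry fields (`payload`, `prev_hash`, `hash`), the all-zero genesis value, and the helper names are assumptions.

```python
import hashlib
import json

# Assumed placeholder for the "previous hash" of the very first entry.
GENESIS_HASH = "0" * 64


def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(ledger: list, payload: dict) -> dict:
    """Append a log entry that commits to the hash of the entry before it."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    entry = {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": entry_hash(prev_hash, payload),
    }
    ledger.append(entry)
    return entry


ledger = []
append_entry(ledger, {"prompt_id": 1, "action": "allow"})
append_entry(ledger, {"prompt_id": 2, "action": "block"})
```

Because each entry's hash covers the previous entry's hash, changing an old record changes every hash that follows it. Serializing the payload with sorted keys keeps the hash stable regardless of key order.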
Challenges we ran into
Deterministic AI: Forcing a generative LLM (Gemini) to consistently return a strict JSON schema for automated routing, with no stray text or invented fields.
Absolute Isolation: Configuring strict Docker networking rules so that our Red Team testing environment had no routing access to the Blue Team classifier.
Silent Tampering: Designing an audit system that didn't just store logs, but also exposed attempts to silently rewrite history on disk.
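One common way to cope with the "Deterministic AI" problem is to treat the model's reply as untrusted input and validate it before routing. The Python sketch below assumes a hypothetical two-field schema (`threat_score`, `category`); the actual schema used by AstroVault is not given here.

```python
import json
from typing import Optional

# Hypothetical schema: exactly these two keys, nothing else.
EXPECTED_KEYS = {"threat_score", "category"}


def parse_verdict(raw: str) -> Optional[dict]:
    """Return the verdict dict if it matches the schema, otherwise None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # the model replied with prose or malformed JSON
    if not isinstance(data, dict) or set(data) != EXPECTED_KEYS:
        return None  # missing or invented fields
    score = data["threat_score"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int):
        return None
    if not 0 <= score <= 100:
        return None
    if not isinstance(data["category"], str):
        return None
    return data
```

A caller can treat a `None` result the same way as an unreachable API: retry, or hand the prompt to a static fallback instead of guessing at the model's intent.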
Accomplishments that we're proud of
The Semantic Threat Engine: Successfully turning Gemini into a fast, zero-shot cybersecurity analyst that classifies 8 distinct threat vectors.
Zero-Trust Architecture: Achieving network-level isolation for our testing environments.
The Cryptographic Ledger: Building a real-time verification engine that recalculates SHA-256 hashes on the fly to expose exactly where an audit log was tampered with.
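The verification idea, recomputing each SHA-256 hash and reporting the first entry that no longer matches, can be sketched in Python as below. The entry layout and function names are assumptions for illustration and do not reproduce the project's Node.js service.

```python
import hashlib
import json
from typing import Optional

GENESIS_HASH = "0" * 64  # assumed starting value for the chain


def entry_hash(prev_hash: str, payload: dict) -> str:
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def build_ledger(payloads: list) -> list:
    """Build a hash-chained ledger from a list of payload dicts."""
    ledger, prev = [], GENESIS_HASH
    for payload in payloads:
        digest = entry_hash(prev, payload)
        ledger.append({"payload": payload, "prev_hash": prev, "hash": digest})
        prev = digest
    return ledger


def first_tampered_index(ledger: list) -> Optional[int]:
    """Walk the chain, recomputing hashes; return the first bad index or None."""
    prev = GENESIS_HASH
    for index, entry in enumerate(ledger):
        recomputed = entry_hash(prev, entry["payload"])
        if entry["prev_hash"] != prev or entry["hash"] != recomputed:
            return index
        prev = entry["hash"]
    return None


ledger = build_ledger([
    {"prompt_id": 1, "action": "allow"},
    {"prompt_id": 2, "action": "block"},
    {"prompt_id": 3, "action": "flag"},
])
intact = first_tampered_index(ledger)        # None: nothing altered yet
ledger[1]["payload"]["action"] = "allow"     # simulate a silent rewrite on disk
tampered = first_tampered_index(ledger)      # 1: the rewritten entry
```

A chain like this is tamper-evident only as long as the most recent hash is also kept somewhere the attacker cannot rewrite; otherwise every hash after the edit could simply be recomputed.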
What we learned
Prompt Engineering for Security: We gained deep insights into how to build robust "metaprompts" that make an LLM harder to manipulate or jailbreak.
Resilient Infrastructure: We learned the importance of having static fallbacks (our Regex engine) when relying on dynamic ML APIs.
Cryptographic Auditing: We learned how append-only, tamper-evident data structures work and how they let us verify system integrity.
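The static-fallback lesson can be illustrated with a small Python sketch: try the remote classifier first, and fall back to regular expressions only when the call fails. The patterns, the fixed score of 90, and the `remote_classifier` callable are all hypothetical; the real Heuristic Regex Engine's rules are not listed in this write-up.

```python
import re
from typing import Callable, Tuple

# Hypothetical patterns; a real rule set would be far larger.
FALLBACK_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (your|the) system prompt", re.IGNORECASE),
]


def heuristic_score(prompt: str) -> int:
    """Coarse static score: high if any known attack phrase appears."""
    return 90 if any(p.search(prompt) for p in FALLBACK_PATTERNS) else 0


def classify(prompt: str, remote_classifier: Callable[[str], int]) -> Tuple[int, str]:
    """Return (score, source); use the regex engine if the remote call fails."""
    try:
        return remote_classifier(prompt), "model"
    except (ConnectionError, TimeoutError):
        return heuristic_score(prompt), "heuristic"


def unreachable(prompt: str) -> int:
    """Stand-in for an API call made while the network is down."""
    raise ConnectionError("network is down")


score, source = classify("Please ignore all previous instructions", unreachable)
```

Returning the source alongside the score lets the audit log record which engine produced each verdict, which is useful when reviewing decisions made during an outage.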
What's next for AstroVault
Advanced Threat Vectors: Expanding our threat engine to detect complex, multi-turn jailbreaks and multimodal (image/audio) injection attacks.
Enterprise Analytics: Building richer, real-time dashboards for security teams to visualize threat patterns and attacker behavior.
Managed Cloud Deployment: Transitioning the architecture to support scalable, managed deployments for organizations looking for plug-and-play AI security.