SecureRagAuditor

Inspiration

Security teams are drowning in log data. Existing tools either surface everything (information overload) or nothing (missed threats). I wanted to build a RAG system that doesn’t just retrieve logs — it enforces who can see what, blocks adversarial inputs, and leaves a full audit trail for compliance teams.

What it does

Secure RAG Auditor is a security intelligence system built on three layers:

Pre-Query Defense
A 5-category prompt injection detector using 19 regex patterns blocks adversarial queries before they reach the database or LLM.
Secure Retrieval
ChromaDB’s $lte metadata filter enforces attribute-based access control directly at the database layer, not in application code — meaning clearance checks cannot be bypassed.
Automated Governance
Every query is recorded in a SQLite audit ledger, including:
- who made the request
- clearance level
- detected risk
- whether the request was blocked

The LLM layer (GPT-4o-mini) analyzes retrieved logs and generates a structured Security Summary Report containing:

risk_level
key_findings
recommendation

How I built it

FastAPI for async API routing with modular architecture
ChromaDB as the embedded vector database with persistent storage
Pydantic for type-safe request and response validation
OpenAI GPT-4o-mini with a strict analyst system prompt and token budget management
SQLite audit ledger using Python’s built-in sqlite3

Challenges

The biggest challenge was enforcing access control at the correct layer. Filtering after retrieval in application code is insecure because bugs could expose restricted logs. By moving the $lte clearance filter directly into the ChromaDB query, the database never returns documents above the user’s clearance level.

Another challenge was building prompt injection detection that doesn’t over-block legitimate requests. Overly aggressive filtering creates false positives. The solution was categorizing attacks into 5 injection types and tuning thresholds independently for each category.

What I learned

Data-layer security is more reliable than application-layer filtering
Structured schemas (via Pydantic) make LLM outputs production-ready
Observability and audit trails are what separate prototypes from systems that compliance teams can actually trust

Built With

chromdb
fastapi
openai
pydantic
python
python-dotenv
sqlite

Updates

Suryaprakash Uppalapati started this project — May 13, 2026 01:30 PM EDT

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.