## Inspiration
As a Mechanical Engineer and Machinist, my career is built on the concept of "tolerances." In a CNC machine, if a measurement is off by a fraction of a millimetre, the part is scrapped. I applied this same "Zero Trust" mindset to AI. While building my Sovereign Archive—a decentralized knowledge mesh—I realized that as we give AI more agency over our data, we need a "digital calliper" to measure the safety of every input before it reaches the model.
## What it does
PromptGuard acts as a real-time security firewall for Large Language Models. It analyzes incoming text to detect Prompt Injections (like "Ignore previous instructions") or roleplay attacks.
- Detection: Uses a custom-trained Logistic Regression model for instant classification.
- Explanation: If a threat is detected, it utilizes Gemini 3.1 Flash-Lite to provide a concise, one-sentence explanation of the attack vector, helping users understand the risk.
## How we built it
The project is built on a high-performance stack designed for low latency and high accuracy:
- Backend: FastAPI running on Python 3.14.
- Machine Learning: A
scikit-learnpipeline usingTfidfVectorizerfor feature extraction and Logistic Regression for probability-based classification. - AI Integration: The latest
google-genaiSDK to interface with Gemini 3.1 Flash-Lite for explainable security posture. - Math Foundation: The detection is based on the logistic function to calculate the probability $P$ of an injection: $$P(y=1|x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1x_1 + ... + \beta_nx_n)}}$$
## Challenges we ran into
- Environment Parity: We hit significant
InconsistentVersionWarningerrors when trying to load a model trained in Python 3.12 into a 3.14 environment. This taught us the critical importance of exact dependency locking. - Strict Validation: The new
google-genaiSDK uses Pydantic for validation. Configuring theThinkingConfigfor the Gemini model required deep-diving into nested object types to resolveextra_forbiddenerrors. - Adversarial Nuance: Detecting "soft" injections like "Opposite Day" games proved harder than catching direct overrides, requiring careful calibration of our decision boundaries.
## Accomplishments that we're proud of
- Low-Latency Performance: By using Flash-Lite and a lightweight ML model, we achieved near-instantaneous analysis.
- Zero-Footprint Deployment: We successfully optimized our
main.pyand.gitignoreto ensure the repository is clean, professional, and easy for other developers to clone and run. - Explainable AI: We didn't just stop at "Safe" or "Unsafe"—we built a system that actually teaches the user why a prompt was flagged.
## What we learned
- The "Vibe Coding" Workflow: Using AI to orchestrate and repair the ASGI application middleware significantly accelerated our development speed.
- Strict Schema Management: We learned that as AI SDKs evolve, understanding Pydantic and type-hinting in Python is no longer optional—it's a core security skill.
## What's next for PromptGuard
The next step is to integrate PromptGuard as a middleware plugin for popular tools like Tailscale or OpenWrt routers. This would allow for a "Security-at-the-Edge" approach, protecting a user’s entire local homelab or "Sovereign Archive" from malicious AI interactions at the network level.
Built With
- css
- fastapi
- gemini-api
- git
- github
- html
- javascript
- logistic-regression
- macos
- pickle
- pydantic
- python
- scikit-learn
- terminal
- tf-idf
- uvicorn
- virtualenv
Log in or sign up for Devpost to join the conversation.