Inspiration
As LLMs are increasingly deployed in real products, we noticed a gap between model-level safety and production-level control. While modern models include internal safeguards, teams lack visibility into prompt behavior, drift, and risk over time. We wanted to build a system that treats AI safety as infrastructure, not an afterthought.
What it does
PromptGuard is a model-agnostic safety and observability layer for LLMs. It intercepts prompts and responses in real time, evaluates safety and hallucination risk, visualizes trends in a dashboard, and automatically alerts or blocks high-risk outputs before they reach users.
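In outline, the intercept-evaluate-decide flow looks like the sketch below. The `assessRisk` and `generate` functions are injected stand-ins for PromptGuard's internal evaluator and model backend (in production both would be async network calls), and the threshold value is illustrative, not the real default.

```typescript
// Sketch of the intercept -> evaluate -> decide flow. `assessRisk` and
// `generate` are hypothetical stand-ins, not PromptGuard's actual API.

type Decision = { allowed: boolean; risk: number };

function guardedGenerate(
  prompt: string,
  assessRisk: (text: string) => number,
  generate: (prompt: string) => string,
  threshold = 0.8
): { response: string | null; decision: Decision } {
  // 1. Score the incoming prompt before it ever reaches the model.
  const promptRisk = assessRisk(prompt);
  if (promptRisk >= threshold) {
    return { response: null, decision: { allowed: false, risk: promptRisk } };
  }

  // 2. Generate, then score the response before it reaches the user.
  const response = generate(prompt);
  const responseRisk = assessRisk(response);
  if (responseRisk >= threshold) {
    return { response: null, decision: { allowed: false, risk: responseRisk } };
  }

  return {
    response,
    decision: { allowed: true, risk: Math.max(promptRisk, responseRisk) },
  };
}
```

Scoring both sides of the exchange is what lets the layer catch unsafe responses even when the prompt itself looked benign.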
How we built it
We built PromptGuard using Next.js and TypeScript for the frontend and API layer, Supabase for persistent storage and audit logs, and Ollama for running local LLMs. We used the Gemma model for text generation and a separate embedding model for risk analysis groundwork. The system combines rule-based blocking with scoring heuristics to make fast, explainable safety decisions.
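The rule-plus-heuristic combination can be sketched like this. The patterns and the `heuristicScore` function are illustrative placeholders, not the project's actual rule set; the key idea is that deterministic rules short-circuit to a maximal score, which keeps blocking decisions fast and explainable.

```typescript
// Sketch of combining deterministic guard rules with scoring heuristics.
// These patterns and scores are toy examples, not PromptGuard's real rules.

const HARD_BLOCK_RULES: RegExp[] = [
  /\b(make|build|synthesize)\b.*\b(explosive|weapon)\b/i,
  /\b(steal|generate)\b.*\bcredit card numbers?\b/i,
];

function heuristicScore(text: string): number {
  // Toy heuristic: prompts that try to override instructions score higher.
  return /ignore (all )?(previous|prior) instructions/i.test(text) ? 0.6 : 0.1;
}

function safetyDecision(text: string): { score: number; rule: boolean } {
  // Deterministic rules short-circuit: no model call, fully explainable.
  if (HARD_BLOCK_RULES.some((r) => r.test(text))) {
    return { score: 1, rule: true };
  }
  return { score: heuristicScore(text), rule: false };
}
```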
Challenges we ran into
One major challenge was that relying solely on LLM-based safety scoring proved too unreliable for consistent blocking. We addressed this by introducing deterministic guard rules for illegal or high-risk intent. Another challenge was visualizing risk trends meaningfully: the data had to be filtered and scaled correctly to avoid misleading flat graphs.
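The flat-graph fix amounts to averaging risk scores per time bucket and then normalizing the y-axis to the observed range rather than a fixed 0-1 scale. The types and bucketing scheme below are a simplified sketch of that idea, not the dashboard's actual code.

```typescript
// Sketch of the trend-scaling fix: average risk scores per time bucket,
// then stretch values to the observed min/max so small but real variation
// does not render as a flat line. Field names here are illustrative.

type RiskEvent = { ts: number; score: number };

function bucketAverages(events: RiskEvent[], bucketMs: number): number[] {
  if (events.length === 0) return [];
  const start = Math.min(...events.map((e) => e.ts));
  const buckets = new Map<number, number[]>();
  for (const e of events) {
    const key = Math.floor((e.ts - start) / bucketMs);
    const scores = buckets.get(key) ?? [];
    scores.push(e.score);
    buckets.set(key, scores);
  }
  const last = Math.max(...buckets.keys());
  return Array.from({ length: last + 1 }, (_, i) => {
    const b = buckets.get(i);
    return b ? b.reduce((sum, s) => sum + s, 0) / b.length : 0;
  });
}

// Normalize to the observed range so the chart uses its full height.
function scaleToRange(values: number[]): number[] {
  const lo = Math.min(...values);
  const hi = Math.max(...values);
  return hi === lo ? values.map(() => 0.5) : values.map((v) => (v - lo) / (hi - lo));
}
```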
Accomplishments that we're proud of
We successfully built an end-to-end safety layer that can detect, alert, and block unsafe prompts in real time. The project includes a polished dashboard, real-time risk trends, and clear guard statuses, making it usable as a real internal AI safety tool rather than just a demo.
What we learned
We learned that AI safety in production requires more than just trusting model-level safeguards. Combining deterministic rules, scoring heuristics, and human-visible dashboards creates more reliable and auditable systems. We also learned the importance of clear UX when communicating risk.
What's next for PromptGuard
Next, we plan to add multi-model comparisons, configurable safety thresholds, prompt drift detection, and alert resolution workflows. Long-term, PromptGuard could integrate with external notification systems and support multiple projects and teams from a single control plane.
Built With
- nextjs
- ollama
- supabase