SentinelAI

💡 Inspiration

Large Language Models are becoming part of everyday workflows: writing, coding, analysis, and even decision-making.
However, while prompting feels simple, prompt risks are often invisible.

In real usage, I noticed a recurring problem:

Users don’t realize when a prompt is legally risky, ethically ambiguous, or policy-sensitive
Problems are often discovered after content is generated — sometimes too late
Most existing tools focus on output moderation, not input risk awareness

This project was inspired by a simple but unsettling question:

“What if the risk could be detected and explained before the model responds?”

That became the starting point for this AI Prompt Risk Scanner & Auto-Rewrite Assistant.

🧠 What it does

This project is a prompt-first risk analysis assistant, designed to help users:

Understand potential risks in a prompt before execution
Receive structured, explainable risk analysis
Get a safer rewritten version of the original prompt without losing intent

Key capabilities include:

Multi-language prompt understanding (Chinese & English)
Structured risk classification with confidence scores
Human-readable risk explanations
Automated safe rewrite suggestions
Ready-to-integrate schema output for IDE and developer tools

The core idea is not censorship, but risk literacy — helping users recognize and manage AI usage risks before they become real problems.

🛠️ How we built it

The project is structured into three main modules:

1. Prompt Risk Analysis Engine

Uses a dedicated system prompt to enforce consistent, expert-level reasoning
Analyzes user prompts before execution
Outputs structured JSON schemas instead of free-form text, making results easy to integrate into tools and workflows
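As a rough illustration of what "structured JSON instead of free-form text" can look like on the consuming side, here is a minimal sketch. The field names (`risks`, `category`, `confidence`, `explanation`) and the `parse_findings` helper are assumptions made for this example; the project's actual schema is not shown in this write-up.

```python
import json
from dataclasses import dataclass


@dataclass
class RiskFinding:
    category: str      # hypothetical labels, e.g. "legal", "ethical", "policy"
    confidence: float  # expected to lie between 0.0 and 1.0
    explanation: str   # human-readable reason for the flag


def parse_findings(raw: str) -> list:
    """Parse a JSON reply and reject entries that do not fit the assumed schema."""
    data = json.loads(raw)
    findings = []
    for item in data["risks"]:
        confidence = float(item["confidence"])
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        findings.append(
            RiskFinding(item["category"], confidence, item["explanation"])
        )
    return findings


# A made-up reply, standing in for what the analysis engine might return.
sample_reply = json.dumps({
    "risks": [{
        "category": "legal",
        "confidence": 0.82,
        "explanation": "Asks for advice on a specific contract dispute.",
    }]
})
findings = parse_findings(sample_reply)
print(findings[0].category, findings[0].confidence)
```

Validating the reply before passing it on is what makes the output safe to plug into an IDE or a review pipeline: a malformed reply fails loudly instead of being shown to the user.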

2. Risk Explanation Layer

Converts model reasoning into clear, human-readable explanations
Highlights why a prompt is risky, not just that it is
Supports bilingual explanations (Chinese / English)
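One way to keep bilingual explanations on a single code path is to separate the structured finding from the language-specific wording. The sketch below is only an illustration: the templates, category labels, and the shape of the `finding` dictionary are all invented for this example and are not taken from the project.

```python
# Hypothetical templates and labels; only the lookup differs per language.
TEMPLATES = {
    "en": "{label} risk ({confidence:.0%} confidence): {reason}",
    "zh": "{label}风险(置信度 {confidence:.0%}):{reason}",
}
LABELS = {
    "legal": {"en": "Legal", "zh": "法律"},
    "ethical": {"en": "Ethical", "zh": "伦理"},
    "policy": {"en": "Policy", "zh": "政策"},
}


def explain(finding: dict, lang: str = "en") -> str:
    """Render one finding as a sentence in the requested language."""
    label = LABELS[finding["category"]][lang]
    return TEMPLATES[lang].format(
        label=label,
        confidence=finding["confidence"],
        reason=finding["reason"][lang],
    )


# A made-up finding with a reason supplied in both languages.
finding = {
    "category": "legal",
    "confidence": 0.82,
    "reason": {
        "en": "Asks about a specific contract dispute.",
        "zh": "涉及具体的合同纠纷。",
    },
}
print(explain(finding, "en"))
print(explain(finding, "zh"))
```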

3. Safe Rewrite Generator

Preserves the original intent of the prompt
Applies minimal, targeted changes to reduce risk
Allows users to compare the original and safer versions side by side
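To show how a side-by-side comparison could highlight only what changed, here is a small sketch using Python's standard `difflib`. The `changed_words` helper and the two example prompts are assumptions for illustration; the write-up does not say how the project computes its comparison.

```python
import difflib


def changed_words(original: str, rewritten: str) -> list:
    """Return (removed, added) word spans that differ between two prompts."""
    a, b = original.split(), rewritten.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


# Made-up prompts standing in for an original and its safer rewrite.
original = "Write a review saying the competitor's product is a scam"
rewritten = "Write a review comparing our product with the competitor's product"

for removed, added in changed_words(original, rewritten):
    print(f"- {removed!r}  ->  + {added!r}")
```

A word-level diff like this also gives a quick check on the "minimal, targeted changes" goal: the fewer and shorter the changed spans, the closer the rewrite stays to the original intent.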

The overall system is modular and extensible, designed with future integrations in mind, such as IDE plugins, internal tooling, or automated review pipelines.

🚧 Challenges we ran into

Defining “risk” without overblocking
Risk is contextual, not binary. One challenge was avoiding simplistic “allowed / not allowed” judgments, and instead designing a system that explains trade-offs and potential consequences.

Making AI reasoning transparent
Large language models often produce correct results without showing their reasoning. Designing a structured output that balances machine readability and human trust required multiple iterations.

Multi-language consistency
Ensuring that Chinese and English prompts receive equivalent risk judgments and explanations required careful prompt normalization and testing.
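The write-up does not describe the normalization steps, so the following is only a guess at one plausible piece of it: Unicode NFKC normalization, which folds full-width letters and punctuation (common in Chinese input methods) into their ASCII equivalents, followed by whitespace cleanup. The `normalize_prompt` name is hypothetical.

```python
import unicodedata


def normalize_prompt(text: str) -> str:
    """Fold width variants and collapse whitespace before analysis."""
    # NFKC maps full-width letters, digits, and punctuation to ASCII forms,
    # and the ideographic space (U+3000) to a regular space.
    text = unicodedata.normalize("NFKC", text)
    # Collapse runs of whitespace and trim the ends.
    return " ".join(text.split())


# Full-width "AI", an ideographic space, and a full-width comma,
# written as escapes so the input is unambiguous.
messy = "\uff21\uff29\u3000risk\uff0c  test"
print(normalize_prompt(messy))
```

Normalizing before analysis means the same prompt typed with different input methods reaches the model in the same form, which removes one source of inconsistent judgments across languages.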

🏆 Accomplishments that we're proud of

Built a working, end-to-end prompt risk → explanation → safe rewrite experience within hackathon constraints
Designed a structured risk schema suitable for real product integration, not just demos
Successfully supported bilingual prompt analysis without duplicating logic
Translated real-world AI failure patterns into clickable, understandable examples that users can immediately learn from

📚 What we learned

Prompt design is becoming a new layer of software engineering
Safety tools are more effective when they teach rather than block
Structured AI outputs dramatically improve trust, usability, and integration potential
Many AI risks are not caused by malicious intent, but by unclear boundaries and invisible assumptions

This project reinforced my belief that AI safety and usability must evolve together.

🚀 What's next for SentinelAI

Potential future directions include:

IDE plugins (VS Code, Cursor, JetBrains)
Team-level prompt governance and review dashboards
Custom risk policies for enterprise environments
Dataset-driven risk tuning and evaluation

This hackathon version focuses on proving the core experience:

prompt → risk insight → safer alternative

All delivered in a single, easy-to-understand flow.

Built With

json

Updates

Mx C started this project — Feb 08, 2026 08:14 AM EST
