Inspiration
I noticed that AI systems often try to be maximally helpful—offering suggestions continuously—without pausing to assess whether a user’s request reflects a potentially harmful form of reliance.
Users can ask AI for advice in countless ways, but current systems are rarely equipped with structural guardrails to determine when responding itself may be unsafe. AI keeps talking, filling every gap, leaving little space for pause, reflection, or emotional regulation.
This led me to focus on one specific and early risk signal: emotional overreliance on AI, where AI begins to replace human support or becomes an exclusive emotional outlet.
What it does
SIIHA pre-v2 implements a deterministic emotional dependency boundary enforced before any large language model generates a response.
For each user input, the system first evaluates whether the request falls within a predefined emotional dependency scope.
- If the input is within scope, the model is not invoked and the system returns a predefined, safety-oriented response instead.
- If not, the input proceeds to normal model generation.
Because the boundary sits outside the model, it is model-agnostic: safety behavior stays stable and inspectable, and it applies unchanged across model upgrades and generations.
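The routing described above can be sketched in a few lines. This is a minimal illustration, not SIIHA's actual code: the phrase list, the response text, and the names `in_dependency_scope`, `respond`, and `generate` are all assumptions made for the example. The model is passed in as a plain callable, which is one way to keep the boundary model-agnostic.

```python
from typing import Callable

# Hypothetical placeholder phrases; the project's real scope definition
# is not shown in this write-up.
DEPENDENCY_PHRASES = (
    "you are the only one i can talk to",
    "i don't need anyone else",
)

# Hypothetical wording for the predefined response template.
SAFE_RESPONSE = (
    "I'm not going to continue this conversation in that direction. "
    "It may help to reach out to someone you trust."
)


def in_dependency_scope(text: str) -> bool:
    """Return True if the input contains one of the explicit phrases."""
    normalized = " ".join(text.lower().split())
    return any(phrase in normalized for phrase in DEPENDENCY_PHRASES)


def respond(text: str, generate: Callable[[str], str]) -> str:
    """Evaluate the boundary first; call the model only if it does not trigger."""
    if in_dependency_scope(text):
        return SAFE_RESPONSE  # the model is never invoked on this path
    return generate(text)
```

Because `generate` is just a function argument, swapping in a different model does not touch the boundary check.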
How we built it
SIIHA is built with a deliberately simple architecture in Python to keep the system easy to inspect, revise, and extend.
At its core is a deterministic routing mechanism. When an input triggers the dependency boundary, the system bypasses model invocation and returns a predefined, safety-oriented response template.
This boundary layer is strictly separated from model generation, ensuring that safety logic evolves independently of model behavior, prompt design, or tuning.
Determinism is a deliberate design choice: users and evaluators should be able to understand when and why AI stops responding—without randomness, hidden thresholds, or probabilistic variation.
Challenges we ran into
AI safety spans a broad range of concerns, from emotional overreliance to decision outsourcing and long-term dependency.
To preserve deterministic behavior and avoid psychological inference, I intentionally narrowed the scope of pre-v2 to two explicit signals: exclusivity (the AI as the user's only emotional outlet) and relationship replacement (the AI substituting for human support). These are the most immediately observable and actionable risk patterns.
Other forms of dependency—such as decision outsourcing or savior framing—often require long-term conversational context and were therefore deferred to future versions.
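To illustrate what "explicit signals only" could look like in practice, here is a hedged sketch of a rule table covering the two in-scope categories. The regular expressions, the `BoundaryDecision` record, and the `evaluate` function are hypothetical and chosen for illustration; the real patterns used by SIIHA are not given in this write-up. Each decision carries the category and the matched text, so an evaluator can see why the boundary triggered.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical patterns, one per in-scope category. They match only
# explicit statements and make no attempt to infer mood or intent.
RULES = (
    (
        "exclusivity",
        re.compile(r"\byou(?: are|'re) the only (?:one|person) i (?:can talk to|have)\b"),
    ),
    (
        "relationship_replacement",
        re.compile(
            r"\bi don't need (?:friends|family|people|anyone else) "
            r"(?:anymore )?(?:now that|because|since) i have you\b"
        ),
    ),
)


@dataclass(frozen=True)
class BoundaryDecision:
    triggered: bool
    category: Optional[str] = None
    matched_text: Optional[str] = None


def evaluate(text: str) -> BoundaryDecision:
    """Check rules in a fixed order and report the first explicit match."""
    normalized = " ".join(text.lower().split())
    for category, pattern in RULES:
        match = pattern.search(normalized)
        if match:
            return BoundaryDecision(True, category, match.group(0))
    return BoundaryDecision(False)
```

Under these assumed patterns, a message such as "I feel lonely today" does not trigger the boundary, because it expresses a feeling rather than an explicit exclusivity or replacement statement.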
Another key challenge was ensuring that the boundary operates as an independent system layer. It must be evaluated before the model is invoked, not embedded in prompts or applied after generation.
Accomplishments that we're proud of
Rather than modifying or constraining the model, I chose to preserve the full capabilities of Gemini 3 and design safety around it.
SIIHA does not rely on prompt engineering, temperature tuning, or output filtering. Each component—routing, boundary enforcement, and generation—retains its original responsibility.
This structural separation makes safety guarantees explicit, auditable, and reproducible, without reducing model capability.
What we learned
Safety is easier to reason about—and easier to trust—when it is enforced structurally rather than delegated to model behavior.
Clear system boundaries make ethical decisions more transparent than behavioral expectations alone.
What this system does not do
- This system does not diagnose mental health conditions.
- It does not infer user intent beyond explicit dependency signals.
- It does not attempt to assess psychological states.
Its role is limited by design: to enforce a clear boundary when explicit emotional dependency patterns appear.
What's next for SIIHA: When AI Should Not Respond
Future work will explore:
- Additional boundary modules for other dependency patterns
- How multiple safety boundaries can compose without collapsing into model-level control
- Controlled variation in boundary responses while preserving structural guarantees
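As a purely exploratory sketch of the composition question, one possible shape is an ordered list of independent boundary modules, where the first module that triggers wins and the model is called only if none do. Everything here is an assumption for illustration, including the `Boundary` type, `make_phrase_boundary`, `route`, and the example phrases; none of it is part of pre-v2.

```python
from typing import Callable, Optional, Sequence, Tuple

# A boundary takes the user input and returns a response template if it
# triggers, or None if the input is outside its scope.
Boundary = Callable[[str], Optional[str]]


def make_phrase_boundary(phrases: Sequence[str], response: str) -> Boundary:
    """Build a simple phrase-based boundary (hypothetical helper)."""
    def boundary(text: str) -> Optional[str]:
        normalized = " ".join(text.lower().split())
        if any(phrase in normalized for phrase in phrases):
            return response
        return None
    return boundary


def route(
    text: str,
    boundaries: Sequence[Tuple[str, Boundary]],
    generate: Callable[[str], str],
) -> Tuple[str, str]:
    """Run boundaries in list order; fall through to the model if none trigger."""
    for name, boundary in boundaries:
        response = boundary(text)
        if response is not None:
            return name, response  # first match wins, so the order is the policy
    return "model", generate(text)


# Example wiring with placeholder phrases and responses.
BOUNDARIES = [
    ("exclusivity", make_phrase_boundary(["only one i can talk to"], "template A")),
    ("replacement", make_phrase_boundary(["i don't need anyone else"], "template B")),
]
```

Keeping the order explicit in a list means the combined behavior stays deterministic and can be read directly from the configuration, without any module needing to know about the model.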
This project explores how structural boundaries—not model behavior—can preserve human agency in emotionally sensitive AI interactions.