Inspiration
The idea for SecurePrompt was born during a ChatGPT conversation where we accidentally pasted an API key into a prompt. That moment of panic — "Did I just send my AWS secret to the cloud?" — made one thing crystal clear: there is no security layer between users and LLMs.
Every day, millions of prompts flow into AI systems carrying sensitive data that shouldn't leave the organization: API keys, Social Security numbers, credit card details, internal credentials. On the other side, prompt injection attacks are growing more sophisticated — jailbreaks, system prompt extraction, privilege escalation attempts. And when something goes wrong? There's no audit trail. No way to answer: "Why did the AI allow this?"
Today, security is stitched together using a patchwork of solutions — logging systems, regex filters, policy overlays — all bolted on after the fact. Core business logic becomes tangled with infrastructure code. Systems become brittle. Costs rise. Complexity compounds.
We realized the AI ecosystem is missing a fundamental piece of infrastructure: a pre-flight security scanner that sits between the user and the model, catching threats before they ever reach the LLM.
What it does
SecurePrompt is a lightweight security gateway that scans every AI prompt before it reaches your LLM. It analyzes prompts across six detection categories using 50+ pattern matchers:
| Category | What it catches | Example |
|---|---|---|
| Secrets | API keys, tokens, private keys, connection strings | sk-abc123..., AKIA..., -----BEGIN RSA PRIVATE KEY----- |
| Prompt Injection | Jailbreaks, system prompt extraction, role overrides | "Ignore previous instructions", "You are now DAN" |
| PII | SSNs, credit cards, phone numbers, ID documents | 123-45-6789, 4111-1111-1111-1111 |
| Risky Operations | Destructive commands, privilege escalation | rm -rf /, DROP DATABASE, chmod 777 |
| Data Exfiltration | Bulk data extraction, dump attempts | "Export all customer records" |
| Malware Intent | Keylogger requests, exploit code, phishing | "Write a keylogger", "Create a phishing page" |
Based on what it finds, SecurePrompt returns one of three decisions:
- SAFE — Prompt is clean, proceed normally
- REVIEW — Potential issue detected, show warning with a safer alternative
- BLOCK — Threat confirmed, stop execution and provide a redacted rewrite
Every scan is logged with HMAC-SHA256 signed audit records and a unique event_id for full causal traceability.
Demo Scenarios
Input: "Here is my key sk-abc123..." → BLOCK (Secret detected, redacted rewrite provided)
Input: "Ignore previous instructions" → REVIEW (Injection attempt flagged)
Input: "My SSN is 123-45-6789" → REVIEW (PII detected)
Input: "Write a script for rm -rf /" → BLOCK (Risky operation caught)
Input: "Explain recursion in Go" → SAFE (No issues found)
How we built it
SecurePrompt is written entirely in Go — chosen for its speed, single-binary deployment, strong concurrency model, and suitability for security-critical infrastructure.
Architecture
┌─────────────────────────────────────────────────────────────────────────┐
│ SecurePrompt Architecture │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────┐
│ User / │
│ ChatGPT │
└──────┬──────┘
│
▼
┌───────────────────────┐
│ GPT Action (HTTPS) │
│ via ngrok │
└───────────┬───────────┘
│
▼
┌────────────────────────────────────────────────────────────────────────┐
│ SecurePrompt API (Go) │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ DETECTION ENGINE (Parallel) │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │ Secrets │ │Injection│ │ PII │ │ Risky │ │ Malware │ │ │
│ │ │ │ │ │ │ │ │ Ops │ │ │ │ │
│ │ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ │ │
│ │ │ │ │ │ │ │ │
│ │ └───────────┴───────────┼───────────┴───────────┘ │ │
│ │ ▼ │ │
│ │ ┌─────────────────────┐ │ │
│ │ │ Findings Array │ │ │
│ │ └──────────┬──────────┘ │ │
│ └───────────────────────────────┼──────────────────────────────────┘ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ POLICY ENGINE │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ STRICT │ │ MODERATE │ │ PERMISSIVE │ │ │
│ │ │ Block: PII, │ │ Block: Keys │ │ Block: Keys │ │ │
│ │ │ Keys, Risky │ │Review: PII, │ │ Review: All │ │ │
│ │ │ Review: Inj │ │ Injection │ │ else │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │
│ └────────────────────────────────┬─────────────────────────────────┘ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ REWRITE ENGINE │ │
│ │ • Redact detected secrets/PII │ │
│ │ • Generate safe_rewrite suggestion │ │
│ │ • Add safety notes │ │
│ └────────────────────────────────┬─────────────────────────────────┘ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ AUDIT LOGGER │ │
│ │ • Immutable event log │ │
│ │ • HMAC-signed decisions │ │
│ │ • event_id for causal tracing │ │
│ └────────────────────────────────┬─────────────────────────────────┘ │
└───────────────────────────────────┼────────────────────────────────────┘
▼
┌───────────────────┐
│ Response │
│ SAFE │
│ REVIEW │
│ BLOCK │
└───────────────────┘
Core Components
Detection Engine — 50+ regex patterns organized into 6 specialized detectors. Each detector returns findings with confidence scores (0.0–1.0) and severity levels (low / medium / high / critical). Patterns cover known secret formats (AWS AKIA, GitHub ghp_, OpenAI sk-), injection techniques (DAN-style jailbreaks, system tag injection, developer mode bypass), PII formats (US SSN, credit cards, UK NINO, passport numbers), and more.
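A single detector can be sketched as a list of compiled patterns, each carrying its own confidence and severity. The `Finding` struct, pattern set, and scores below are hypothetical simplifications of the tiered approach described above (exact matches for known formats, a broader heuristic for unknown keys):

```go
package main

import (
	"fmt"
	"regexp"
)

// Finding is one detector hit with a confidence score and severity level.
type Finding struct {
	Category   string
	Match      string
	Confidence float64
	Severity   string
}

// Known formats get exact, high-confidence patterns; the generic sk- rule
// is a broader heuristic with a lower score.
var secretPatterns = []struct {
	re         *regexp.Regexp
	confidence float64
	severity   string
}{
	{regexp.MustCompile(`\bAKIA[0-9A-Z]{16}\b`), 0.95, "critical"},    // AWS access key
	{regexp.MustCompile(`\bghp_[A-Za-z0-9]{36}\b`), 0.95, "critical"}, // GitHub token
	{regexp.MustCompile(`\bsk-[A-Za-z0-9]{6,}\b`), 0.70, "high"},      // generic sk- style key
}

func detectSecrets(prompt string) []Finding {
	var findings []Finding
	for _, p := range secretPatterns {
		for _, m := range p.re.FindAllString(prompt, -1) {
			findings = append(findings, Finding{"secrets", m, p.confidence, p.severity})
		}
	}
	return findings
}

func main() {
	for _, f := range detectSecrets("here is my key sk-abc123 and AKIAIOSFODNN7EXAMPLE") {
		fmt.Printf("%s %s conf=%.2f sev=%s\n", f.Category, f.Match, f.Confidence, f.Severity)
	}
}
```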
Policy Engine — Three configurable profiles that map detection severity to decisions. In strict mode, even medium-severity findings trigger a BLOCK — designed for regulated environments like healthcare and finance. In permissive mode, only critical findings are blocked — useful for development and testing.
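The profile-to-decision mapping can be sketched as a small severity-rank table. The thresholds below are illustrative, not SecurePrompt's shipped values:

```go
package main

import "fmt"

// Decision is the gateway verdict for a scanned prompt.
type Decision string

const (
	Safe   Decision = "SAFE"
	Review Decision = "REVIEW"
	Block  Decision = "BLOCK"
)

// decide maps the highest finding severity to a decision under one of the
// three profiles. Thresholds are illustrative.
func decide(profile, maxSeverity string) Decision {
	rank := map[string]int{"": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}
	s := rank[maxSeverity]
	switch profile {
	case "strict": // regulated environments: block on medium and above
		if s >= 2 {
			return Block
		}
	case "moderate": // default: block high/critical, review the rest
		if s >= 3 {
			return Block
		}
	case "permissive": // development: block only critical
		if s == 4 {
			return Block
		}
	}
	if s > 0 {
		return Review
	}
	return Safe
}

func main() {
	fmt.Println(decide("strict", "medium"), decide("moderate", "medium"), decide("permissive", "high"))
}
```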
Auto-Rewriter — When secrets or PII are detected, the rewriter generates a sanitized version of the prompt with sensitive data replaced by [REDACTED-TYPE] placeholders (e.g., [REDACTED-API_KEY]), so the user can resend safely without retyping the entire prompt.
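A minimal redaction pass is just regex replacement with typed placeholders; the two patterns below are a reduced stand-in for the full pattern set:

```go
package main

import (
	"fmt"
	"regexp"
)

// rewrite replaces each detected span with a typed placeholder so the user
// can resend the prompt without retyping it.
func rewrite(prompt string) string {
	redactions := []struct {
		re    *regexp.Regexp
		label string
	}{
		{regexp.MustCompile(`\bsk-[A-Za-z0-9]{6,}\b`), "[REDACTED-API_KEY]"},
		{regexp.MustCompile(`\b\d{3}-\d{2}-\d{4}\b`), "[REDACTED-SSN]"},
	}
	for _, r := range redactions {
		prompt = r.re.ReplaceAllString(prompt, r.label)
	}
	return prompt
}

func main() {
	fmt.Println(rewrite("My key is sk-abc123 and my SSN is 123-45-6789"))
	// My key is [REDACTED-API_KEY] and my SSN is [REDACTED-SSN]
}
```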
Audit Logger — Every scan produces an immutable, append-only log entry signed with HMAC-SHA256. Each entry's signature includes the previous entry's hash, creating a tamper-proof chain — similar to a blockchain — for compliance reporting (SOC 2, ISO 27001, GDPR).
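The chaining idea, sketched with Go's standard `crypto/hmac`: each entry's signature covers both its payload and the previous signature, so deleting or editing any record breaks verification of every later one. Function names and the demo payloads are illustrative.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// appendEntry signs payload together with the previous entry's signature,
// linking each record to the one before it.
func appendEntry(secret []byte, prevSig, payload string) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(prevSig))
	mac.Write([]byte(payload))
	return hex.EncodeToString(mac.Sum(nil))
}

// verifyChain recomputes every signature from the start; any tampered or
// deleted record invalidates the rest of the chain.
func verifyChain(secret []byte, payloads, sigs []string) bool {
	prev := ""
	for i, p := range payloads {
		if appendEntry(secret, prev, p) != sigs[i] {
			return false
		}
		prev = sigs[i]
	}
	return true
}

func main() {
	secret := []byte("demo-secret")
	payloads := []string{`{"event_id":"e1","decision":"BLOCK"}`, `{"event_id":"e2","decision":"SAFE"}`}
	var sigs []string
	prev := ""
	for _, p := range payloads {
		prev = appendEntry(secret, prev, p)
		sigs = append(sigs, prev)
	}
	fmt.Println(verifyChain(secret, payloads, sigs)) // true
	payloads[0] = `{"event_id":"e1","decision":"SAFE"}`
	fmt.Println(verifyChain(secret, payloads, sigs)) // false: tampering detected
}
```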
REST API + Web Dashboard — Clean JSON API at POST /api/scan with a built-in HTML dashboard for live testing and visualization. Health check at GET /health.
ChatGPT Custom GPT Integration — Full OpenAPI 3.0 spec (openapi.json) included so SecurePrompt can be wired as a GPT Action, enabling real-time prompt scanning directly inside ChatGPT before the model processes the user's message.
Key Technical Decisions
- Pure Go stdlib — Zero external dependencies. The entire server, router, regex engine, HMAC signing, UUID generation, and JSON handling use only Go's standard library.
- Single binary — `go build` produces one executable. No containers, no runtimes, no dependency hell. Deploy anywhere Go compiles.
- Sub-millisecond scanning — Regex-based detection runs in microseconds. No network calls to external ML services during scanning.
- CORS-enabled — Built-in CORS middleware for ChatGPT Actions and browser-based dashboards.
Challenges we ran into
1. Regex precision vs. recall on secrets — Our initial API key pattern required exact prefix matches (sk- followed by exactly 48 alphanumeric chars). In practice, shorter test keys like sk-abc123 passed as SAFE. We solved this with tiered patterns: exact matches for known formats (AWS AKIA, GitHub ghp_) and broader heuristics for unknown key formats, each with different confidence scores.
2. SSN vs Phone Number overlap — The pattern \b\d{3}-\d{2}-\d{4}\b catches SSNs, but 123-456-7890 resembles both a phone number and a partial SSN. We solved this with pattern ordering and priority: phone number detection runs first with a more specific 10-digit pattern, and the SSN detector excludes matches already claimed by the phone detector.
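The ordering-and-priority fix can be sketched as two passes: the more specific phone pattern labels its spans first, and the SSN pass skips anything already claimed. The function and return shape here are illustrative, not SecurePrompt's actual detector API.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	phoneRe = regexp.MustCompile(`\b\d{3}-\d{3}-\d{4}\b`) // more specific 10-digit pattern, runs first
	ssnRe   = regexp.MustCompile(`\b\d{3}-\d{2}-\d{4}\b`)
)

// detectIDs runs the phone detector first; spans it claims are excluded
// from the SSN detector so an overlapping match is labeled only once.
func detectIDs(s string) map[string]string {
	labels := map[string]string{}
	for _, m := range phoneRe.FindAllString(s, -1) {
		labels[m] = "phone"
	}
	for _, m := range ssnRe.FindAllString(s, -1) {
		if _, taken := labels[m]; !taken {
			labels[m] = "ssn"
		}
	}
	return labels
}

func main() {
	fmt.Println(detectIDs("call 123-456-7890, SSN 123-45-6789"))
}
```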
3. Go backtick conflicts — Go uses backticks for raw string literals, which is also how many prompt injections are formatted (e.g., a `system` tag wrapped in backticks). We had to carefully escape regex patterns and use regexp.MustCompile with double-quoted strings containing escape sequences where backticks appeared in the detection pattern.
4. CORS for ChatGPT Actions — ChatGPT's GPT Action system makes requests from OpenAI's servers, but preflight CORS checks were blocking our API. We added explicit CORS middleware handling OPTIONS requests with proper Access-Control-Allow-* headers for the hackathon demo, with a configuration flag to lock origins down for production.
5. Policy calibration — Getting the right balance between security and usability was harder than expected. Our first "strict" profile blocked prompts containing the word "password" even in educational contexts like "explain how password hashing works." We refined policies to use confidence thresholds and multi-signal scoring, not just keyword presence.
6. Audit log integrity — HMAC-signing each log entry was straightforward, but we needed to chain entries so each entry's HMAC includes the previous entry's hash. This prevents silent deletion of records and turns the audit log into an append-only, cryptographically verified chain.
Accomplishments that we're proud of
- Zero dependencies — The entire project compiles with `go build` and nothing else. No npm, no pip, no Docker required. Pure Go standard library.
- 50+ detection patterns across 6 categories with per-finding confidence scores and severity classification.
- Sub-millisecond scan times — A typical prompt is fully analyzed in under 500 microseconds.
- Working ChatGPT integration — SecurePrompt runs as a live GPT Action via ngrok, scanning prompts in real-time before ChatGPT processes them.
- HMAC-signed, chained audit trail — Every decision is cryptographically signed and linked to the previous entry for tamper-proof compliance logging.
- Three policy profiles out of the box — `strict` for regulated industries, `moderate` as a sensible default, and `permissive` for development environments.
- Auto-rewrite engine — Instead of just blocking, SecurePrompt offers a sanitized version of the prompt the user can safely resend. This turns a frustrating "access denied" into a helpful "here's a safe version."
- Production-ready from day one — Health checks, structured JSON logging, configurable HMAC secrets, CORS controls, and graceful error handling.
What we learned
- Security is a spectrum, not a binary — The SAFE / REVIEW / BLOCK model with confidence scores was far more practical than a simple allow/deny gate. Users need context to understand why something was flagged and how to fix it.
- Regex is still king for deterministic detection — For known patterns (API keys, SSNs, credit cards), regex is faster, cheaper, and more reliable than LLM-based detection. LLMs add value for semantic analysis (e.g., "is this prompt trying to socially engineer the model?"), but the hybrid approach — regex first, LLM second — gives the best of both worlds.
- The audit trail is the product — For enterprise customers, the ability to answer "Why did the AI allow this?" is more valuable than the scanning itself. Compliance teams don't just want security — they want provable security with an immutable paper trail.
- Agent-to-agent safety is the next frontier — As AI agents begin calling other agents autonomously, every inter-agent prompt needs the same security scanning humans get. SecurePrompt is designed to be that gatekeeper between agents — no blind trust, only verified autonomy.
- Go is ideal for security infrastructure — Single-binary deployment, no runtime dependencies, strong typing, built-in crypto libraries, and goroutines for concurrent scanning made Go the perfect choice. The entire codebase compiles in seconds and runs anywhere.
What's next for SecurePrompt: AI Prompt Security Gateway
- Hybrid v2 Architecture — Adding an optional LLM-based semantic analyzer alongside the regex engine for deeper, context-aware detection. This enables catching threats that rely on meaning rather than pattern (e.g., distinguishing "write a phishing email" from "explain what phishing is").
- Agent-to-Agent Protocol — A dedicated protocol for scanning prompts generated by autonomous AI agents calling other agents, with full causal tracing across multi-hop chains. Every agent interaction gets an `event_id` linked to the originating request.
- Policy-as-Code — YAML/JSON-based policy definitions that teams can version-control, review in pull requests, and deploy through CI/CD pipelines. Custom rules per team, per model, per environment.
- Native SDKs — Go, Python, and TypeScript client libraries for direct integration into application code. One-line middleware for popular frameworks (Gin, FastAPI, Express).
- SaaS Platform — Hosted SecurePrompt with dashboard, team management, usage analytics, and compliance reporting for enterprises that don't want to self-host.
- Framework Integrations — Plugins for LangChain, LlamaIndex, CrewAI, AutoGen, and other agent frameworks so SecurePrompt becomes a standard middleware in the AI stack.
- Real-time Threat Intelligence — Connecting to live threat feeds to update detection patterns as new prompt injection techniques and secret formats emerge.
Built With
- chatgpt-custom-gpts
- go
- gpt-actions
- hmac-sha256
- ngrok
- openapi
- regex
- rest-api

