Inspiration

Prompt Injection is the #1 vulnerability in the OWASP Top 10 for LLM Applications, yet defensive tools are almost non-existent. Most solutions just block attacks, which teaches us nothing about attacker techniques. I wanted to build a honeypot — a trap that deceives attackers and collects real threat intelligence.

What it does

LLM Honeypot is a fake corporate AI assistant that detects and traps Prompt Injection and Jailbreak attacks. When an attacker sends a malicious prompt, the system returns realistic fake credentials (API keys, tokens, connection strings) and logs the full attack details for threat intelligence.

How we built it

Backend built with FastAPI and Uvicorn. Attack detection uses heuristic pattern matching with 28+ jailbreak patterns. The frontend is vanilla HTML/CSS/JS styled as a corporate chatbot. Deployed on Render free tier. The fake credential generator creates convincing bait to keep attackers engaged while logging their techniques.

Challenges we ran into

The main challenge was designing believable deception. Fake responses had to look like real security failures, not obvious traps. Another challenge was pattern coverage — attackers constantly evolve their techniques. Static patterns catch known attacks but miss novel ones, which is why ML-based detection is the next step.

Accomplishments that we're proud of

Built and deployed a working honeypot from scratch in one week. The project is fully open source with a live demo. Published a technical article that reached the Dev.to community. Launched on Product Hunt. The system successfully detects and logs real attack patterns in the wild.

What we learned

Deception is more powerful than blocking for threat intelligence. Free tier deployment is viable for prototypes. The AI security field is still young with huge room for contribution. Real attack data is scarce, making honeypots valuable for research and ML training datasets.

What's next for LLM Honeypot

Fine-tune a DistilBERT classifier on captured attack payloads for robust detection. Add canary tokens to track attacker movement after they take the bait. Build a real-time threat intelligence dashboard. Containerize with Docker for easy deployment. Eventually publish research on prompt injection patterns observed in the wild.

Built With

Share this project:

Updates