Honeypot the Bot!

HTB logo/thumbnail for project

Inspiration

I was inspired to create this project due to a personal newfound interest in cybersecurity, and how we can secure our software in the advent of AI/ML technology.

What it does

Honeypot - Fake digital assets or environments designed to attract cybercriminals. These assets could include software applications and data that act like a legitimate computer system, contain sensitive data, and aren't secure.

Honeypot the Bot! (HTB) casts the effective approach of honeypotting to continuously harden our AI chatbots against attackers. "Jailbreaking" chatbots is possible through malicious prompts, and we need to train our models to resist these attacks. Especially since every chatbot exploit is automatically a day-zero vulnerability.

HTB utilizes another AI model trained on a common set of malicious prompts AND prompts specifically targeted to your chatbot's specific business logic. HTB's AI model will monitor interactions between attackers and the AI chatbot running in a honeypot, and will identify prompts that are likely to be malicious. The security team will identify the malicious prompts, train their existing chatbots against those prompts, and then redeploy the chatbot in the honeypot and across other environments.

These steps are looped to allow for a continuous hardening of AI chatbots against the threat of jailbreaking. HTB can be easily integrated into any application through limited configuration and deployment as a container in your existing honeypot Kubernetes cluster. Through this solution, your company is...

Continuously protected from the threat of jailbreaking
Able to shift security left and can stop exploits before they're used in a production environment
Safe from possible brand, reputation, and financial losses due to a jailbreak on your bot

How I built it

In it's current form--uses Python, ChatGPT's API, and LlamaIndex!

Challenges I ran into

Here are just some of the challenges I ran into...

Prompt engineering to get ChatGPT to do what we need
Obtaining data sets on malicious prompts
Defining to ChatGPT the criteria with which to judge a prompt as malicious

Accomplishments that I'm proud of

Having a working demo of HTB's trained AI model!

What I learned

There is a lot more depth to how GPT can be used than I thought. I learned a lot about what indexing, embedding, and working with these models really means. And of course, I learned more about Docker!

What's next for Honeypot the Bot!

Here are the next steps!

Gather more data sets of malicious prompts to train HTB's AI model
Configure scraping of logs through an endpoint exposed on the target application
Create a front-end monitoring page to view HTB's exported data, written in the React framework

Built With

chatgpt
docker
llamaindex
python
react

Updates

Fernando Sesma started this project — Nov 07, 2023 07:45 PM EST

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.