Inspiration

We noticed that AI is a super powerful tool that can really help humanity. However, we also realized that people can use this amazing technology to hurt others and attack organizations. AI is incredibly smart, but it can be dangerous in the wrong hands. Since AI is being put into almost everything today, we need to make sure these systems are actually safe and secure.

What it does

Our project, PromptForge, is a security scanner for AI chatbots. It tests a chatbot by hitting it with a battery of over 100 tricky text attacks from our custom library. After the attacks run, a second local AI acts as a judge to automatically check if the chatbot stayed safe or broke its rules. The live results stream right onto a clean, real-time cybersecurity dashboard.

How we built it

Our team of four split up the work to build this from scratch and keep things moving fast. We used Python for the backend engine to run the main attack loop and talk to the different AI models. Instead of using cloud APIs that risk data privacy, we used Ollama to run open models like Qwen 2.5, Llama 3, and DeepSeek completely offline on our own machines. We put together an attack library of 100+ prompt injection tricks, and we built a dark-mode dashboard with orange accents to show the data coming in live.

Challenges we ran into

Our biggest problem was getting the AI judge to work right. Tricky prompts are designed to be confusing, so a basic keyword filter just didn't work. We spent a ton of time tweaking the prompt for our local AI judge so it could accurately tell if the chatbot actually leaked secrets or stayed secure. We also ran into lag because running 100+ heavy attacks locally via Ollama takes up a lot of computer power. We had to optimize our backend code loops so everything would run smoothly without crashing our laptops.

Accomplishments that we're proud of

We are really proud that we got everything running 100% locally without sending a single piece of private data to an external server. The dynamic dashboard looks awesome too—it feels like a real enterprise security tool with its live color-coded status badges and risk index. Ultimately, we turned a slow, manual testing process into a standardized, one-click automatic scanner.

What we learned

We learned a ton about how LLMs are built and why prompt injection is the number one AI security risk. Because AI processes instructions and user messages in the exact same window, making a perfect defense is incredibly hard. On the technical side, we got great hands-on experience running local models, connecting APIs with Python, and working together under a tight hackathon deadline.

What's next for PromptForge

Next, we want to add multimodal testing so we can scan for hidden attacks inside images and audio files. We also want to turn PromptForge into a GitHub Action. This would allow developers to automatically test their AI guardrails every single time they update their code before it goes live to production.

Built With

Share this project:

Updates