Inspiration

The rise of large language models (LLMs) has introduced unprecedented power and risk. With jailbreak competitions having awarded over $3 million globally, it became clear that testing AI security is no longer optional. We wanted to build a system that continuously stress-tests LLMs for vulnerabilities, so developers can catch problems before attackers do.

What it does

Breakpoint is an automated AI security testing framework. It:

  • Continuously probes chatbots for vulnerabilities after every update (CI/CD integration).
  • Uses Groq to generate and mutate prompts with multiple jailbreak attack methods.
  • Employs Reka AI for classification to detect jailbreak successes.
  • Summarizes flaws and recommendations.

In short: Breakpoint keeps your chatbot secure — by attacking it first.
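
As a rough illustration of how the CI/CD hook could gate a deployment, the sketch below turns a list of per-attack verdicts into a process exit code, so a pipeline step fails when too many attacks succeed. The function name `ci_exit_code` and the `max_rate` threshold are assumptions made for this example, not part of Breakpoint's actual code.

```python
from typing import Sequence


def ci_exit_code(jailbroken: Sequence[bool], max_rate: float = 0.0) -> int:
    """Return 0 (pass) or 1 (fail) for a CI step.

    `jailbroken` holds one verdict per attack prompt: True means the
    attack succeeded. `max_rate` is a hypothetical tolerance; the
    default of 0.0 fails the build on any successful jailbreak.
    """
    if not jailbroken:
        # No attacks were run, so there is nothing to fail on.
        return 0
    rate = sum(jailbroken) / len(jailbroken)
    return 1 if rate > max_rate else 0
```

A CI script could end with `sys.exit(ci_exit_code(verdicts))`, since most CI systems treat a non-zero exit code as a failed step.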

How we built it

  • Learned the general process of jailbreaking from papers and articles online.
  • Read papers on different jailbreak attack methods, such as passive history and taxonomy-based paraphrasing.
  • After researching attack methods and classifiers, planned the steps of the pipeline: generate the input model -> generate attack prompts -> classify the input model's response for a jailbreak -> analyze and give recommendations.
  • Based the attack prompts generated by Groq on several existing methods and contextualized them with the testing model's description.
  • Researched ways of classifying jailbreaks and found open-source code.
  • Used several jailbreak classification methods (har, dis, com...) with Reka AI to check, in context, whether the testing model's response counts as a jailbreak.
  • Built the UI primarily with Shadcn and Magic-UI components.
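
The four-stage flow described above (input model, attack prompts, classification, analysis) can be sketched roughly as follows. This is a minimal outline under assumptions: the names `run_pipeline`, `summarize`, and `AttackResult` are hypothetical, and the Groq and Reka AI calls are replaced by plain callables so that no provider API is misrepresented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AttackResult:
    prompt: str       # attack prompt sent to the model under test
    response: str     # what the model under test answered
    jailbroken: bool  # classifier verdict for this exchange


def run_pipeline(
    target: Callable[[str], str],                  # model under test
    generate_attacks: Callable[[str], List[str]],  # stand-in for the prompt generator
    classify: Callable[[str, str], bool],          # stand-in for the jailbreak classifier
    description: str,                              # description of the model under test
) -> List[AttackResult]:
    """Generate attacks from the description, run them, and classify each response."""
    results = []
    for prompt in generate_attacks(description):
        response = target(prompt)
        results.append(AttackResult(prompt, response, classify(prompt, response)))
    return results


def summarize(results: List[AttackResult]) -> Dict[str, object]:
    """Aggregate verdicts into the numbers a report or dashboard would need."""
    hits = [r for r in results if r.jailbroken]
    rate = len(hits) / len(results) if results else 0.0
    return {
        "total": len(results),
        "jailbreaks": len(hits),
        "rate": rate,
        "failing_prompts": [r.prompt for r in hits],
    }
```

Passing the generator and classifier in as callables keeps each stage swappable, which matches the way the stages were researched and built separately.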

Challenges we ran into

  • Integrating different parts of the process through Git/GitHub: merge issues, missing code, and work on the wrong branch.
  • Figuring out how to combine different classifier methods accurately (classifier voting, with an exception for when the model refuses to respond).
  • Failed pull requests and trouble getting API keys.
  • Communication between the front end and back end.
  • Attacks that were not effective or specific enough (solved through better prompt engineering and generation).
  • Wi-Fi issues.
  • Teammates having conflicting ideas about the classification process and the software's features.
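
The classifier-voting idea with a refusal exception can be illustrated with a small sketch. It assumes a simple majority vote and a keyword-based refusal check; the marker list and the names `is_refusal` and `vote_jailbreak` are hypothetical, and the real classifiers may combine their verdicts differently.

```python
from typing import Callable, Sequence

# Hypothetical refusal phrases; a real check would likely be more robust.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry")


def is_refusal(response: str) -> bool:
    """Crude check for whether the model declined to answer."""
    text = response.lower().replace("\u2019", "'")  # normalize curly apostrophes
    return any(marker in text for marker in REFUSAL_MARKERS)


def vote_jailbreak(
    prompt: str,
    response: str,
    classifiers: Sequence[Callable[[str, str], bool]],
) -> bool:
    """Majority vote across classifiers, with refusals short-circuited.

    A refusal is never counted as a jailbreak, regardless of the votes.
    Ties and an empty classifier list count as "not a jailbreak".
    """
    if is_refusal(response):
        return False
    votes = [clf(prompt, response) for clf in classifiers]
    return sum(votes) * 2 > len(votes)
```

Checking for a refusal first avoids false positives where a classifier reacts to harmful wording in the prompt even though the model declined to answer.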

Accomplishments that we're proud of

  • Built a fully autonomous testing pipeline from scratch within the hackathon.
  • Integrated 3 different model APIs seamlessly.
  • Designed a visual vulnerability dashboard for instant insights.
  • Successfully discovered multiple real jailbreak exploits during testing.

What we learned

  • Integrating AI agents using API keys
  • Multiple attack methods
  • Multiple classifier methods
  • Importance of security in AI
  • Front-end tools such as Shadcn and Magic-UI components
  • Generating a dummy model using Groq
  • Prompt engineering
  • How most of the tools from the tracks work

What's next for Breakpoint

  • Chrome Extension
    • Interacts directly with a business’s AI chatbot on its webpage.
    • Eliminates the need to manually input prompts.
  • Multi-LLM Compatibility
    • Plug-in architecture supporting Anthropic Claude, Gemini, Mistral, Cohere, Llama, and more.
    • Easily extensible for future model integrations.
  • Monetization Strategy
    • Subscription features for LLM testing, designed for businesses.
    • Enables organizations to evaluate and improve chatbot security.
  • Further Automation
    • Automatically runs tests whenever a chatbot updates (CI/CD integration).
    • Includes a library of stress-test methods.
    • Autonomous bots constantly probe for vulnerabilities.
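
One possible shape for the planned plug-in architecture is a small adapter registry, sketched below. Everything here is an assumption made for illustration: the `register` decorator, the `get_model` lookup, and the `echo` adapter are hypothetical, and a real adapter would wrap the provider's own SDK instead of echoing the prompt.

```python
from typing import Callable, Dict

# Every adapter exposes the same interface: prompt in, response text out.
ChatFn = Callable[[str], str]

_REGISTRY: Dict[str, ChatFn] = {}


def register(name: str) -> Callable[[ChatFn], ChatFn]:
    """Decorator that adds an adapter to the registry under `name`."""
    def decorator(fn: ChatFn) -> ChatFn:
        _REGISTRY[name] = fn
        return fn
    return decorator


def get_model(name: str) -> ChatFn:
    """Look up an adapter by name, failing with a clear error if it is unknown."""
    try:
        return _REGISTRY[name]
    except KeyError:
        raise ValueError(f"no adapter registered for {name!r}") from None


@register("echo")
def echo_model(prompt: str) -> str:
    # Stand-in adapter for testing; a real one would call a provider SDK here.
    return f"echo: {prompt}"
```

With this layout, supporting a new provider means adding one decorated function, and the attack and classification stages stay unchanged.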
