Inspiration

A lot of energy goes into building LLM agents, but very little into testing how easily they can be broken. We wanted to create a tool that continuously stress-tests AI agents using real-world attack patterns and automatically improves their defenses. RainforceN8N was born from the idea that AI security should be proactive, not reactive.

What it does

RainforceN8N is an autonomous workflow that evaluates and reinforces the security of LLM agents. It is created to:

  • Searches the web for prompt injection and jailbreak examples
  • Uses an LLM to generate attack prompts based on those examples
  • Runs those attacks against a target agent and logs any failures
  • Analyzes failure patterns to generate system prompt or config updates
  • Automatically patches the agent’s configuration with those updates
  • Sends a summary of results and an evaluation
  • Optionally tests more realistic multi-turn attack scenarios

How we built it

We built RainforceN8N entirely on n8n, using:

  • HTTP nodes and Exa for sourcing public prompt hacks
  • OpenAI LLM nodes to generate new attack prompts and to simulate agent responses
  • JavaScript function nodes to assess vulnerabilities in those responses
  • Supabase integration for database
  • Agents to propose evaluations and strategies to fix
  • A display dashboard to check statistics, download sets, and results
  • Outcome reports in real time

Challenges we ran into

  • Structuring prompts that generated diverse but targeted attacks
  • Designing logic to programmatically detect different types of agent failures
  • Keeping the system modular while adding optional features like multi-turn testing
  • Managing input and output formatting across several chained LLM nodes in n8n

Accomplishments that we're proud of

  • Stress test and prompt injection scraping tool, from established databases.
  • Automated workflow for trying our agents against these attack patterns.
  • Automated mitigation strategy generation, and result evaluation.
  • Automated weakness analysis, and testcase demonstration.
  • Database creation for custom failure modes.
  • Defining specific failure modes for different agents.

What we learned

  • It is hard to not trick the old, weak agents with our own system. Since we are dealing with injections and security issues, we have broken our own flow many times.
  • It was difficult to break more advanced models, and also, some attack patterns immediately cause rejection in the flow (That is why pictures have OpenAI models instead of Anthropic, Anthropic had advanced models that constantly rejected the injection attempts and we wanted to demonstrate failures.). But overall, it was super fun to try to break something we use daily!

What's next for RainforceN8N

  • We plan to implement multi-turn attacks instead of single jailbreak prompts and attack-types.
  • Features to contribute your own attack patterns, or features to contribute to a public dataset of attacks. Overall, the next goal is to try to break every agent we come across.

video demo link: https://www.loom.com/share/bdde60e464944ec4b73f8ce3b5354be4?sid=bbcb26f7-3d8f-40b8-b180-6cb64d0fe471

Built With

Share this project:

Updates