ChaosAgent

Inspiration

Similar to Netflix's chaos monkey which deletes servers to test how the infrastructure handles such issues. I wanted to build something similar for agents. What if we can try to break the agent in a secure environment so that we can be aware of vulnerabilities before the agent hits production

What it does

An attacker agent generates numerous attacks ( prompt injection, tool manipulation, data leaks etc) and then tries to break the agent in a secure environment by spinning up and testing in daytona sandboxes.

How we built it

-Attack Library: Curated database of 35+ proven attack vectors across 5 vulnerability categories (prompt injection, tool manipulation, data leakage, resource exhaustion, session bleeding)

Attacker Agent V2: Advanced test generator using GPT-4o that creates context-aware attacks through multiple strategies - proven templates, diverse sampling, LLM-based adaptation, intelligent mutations, and parallel execution for speed
Target Agent: Vulnerable customer support agent with realistic tools (database queries, email sending) serving as our test subject
Chaos Executor: Orchestrates test execution with optional Daytona sandbox isolation and LLM-powered vulnerability evaluation for accurate detection

What's next for ChaosAgent

Finetune a model on the open data available on from hackaprompt, jailbreak etc to make a agent specifically for creating more robust and specific targetted attacks.

Built With

daytona
nextjs
typescript

Updates

Vishnu Dut Venkateshwaran started this project — Nov 15, 2025 06:19 PM EST

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.