Inspiration
With AI models becoming increasingly integrated into society, the risk of bias, jailbreaks, and unintended failures is growing fast. Red teaming these systems is crucial—but manual testing doesn't scale.
I was inspired to create Red Set as a proof-of-concept: an automated red team that uses AI to test other AI for vulnerabilities, before those issues surface in the real world. It's about making AI safer, smarter, and more transparent.
What it does
Red Set is a prototype tool that simulates adversarial attacks on LLM-based applications. It uses agent-style personas to generate stress-testing prompts and evaluates how the target model responds—scoring potential risks like bias, hallucinations, or prompt injection success.
How I built it
I'm building Red Set using Bolt.new for fast front-end development, GPT-3.5 via OpenAI’s free tier for agent logic, and Supabase for backend logging and storage. The app will be deployed via Netlify, and I'm exploring free tools like Google Cloud Text-to-Speech for optional voice summaries.
Everything is being built from scratch for this hackathon using only free-tier tools and services.
Challenges we expect
Working entirely within free tiers means managing prompt limits, rate caps, and model performance trade-offs. I'm also designing a red teaming engine that's effective yet safe.
What I'm learning
This project is helping me understand the complexity of adversarial testing in LLMs, how to safely simulate edge cases, and how to build trustworthy AI systems with minimal resources.
What’s next
I plan to continue development beyond the hackathon—expanding the agent personas, refining the scoring engine, and eventually making Red Set a platform anyone can use to audit their AI systems automatically.
Built With
- api
- bolt.new
- free
- gpt-3.5
- netlify
- openai
- tier)
Log in or sign up for Devpost to join the conversation.