Inspiration
I wanted to build a privacy tool with interpretability baked in. The project's theme, Classified, is inspired by Billy Butcher (The Boys, Amazon), a mysterious special agent who prefers to keep his personal information to himself. In this demo, Butcher runs a deep search on himself (red team) and hands the findings to the blue team to erase. Not all information can be removed, however, so the scoring system includes a judge model that determines whether the information remaining after removal is still catastrophic. For fun, the agent's voice stays true to Butcher's TV personality.
What it does
Red team (finds information) -> Judge (judges "before" state) -> Blue team (removes information) -> Judge (judges "after" state) -> Full report
This software adds a layer of visualization and interpretability to the process of removing your personal data online. You can see a visual chart of sources and extracted information, with one model for finding information and another for removing it. The judge model scores how worried you should be about the information that cannot be removed (or that you have posted on social media yourself).
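The flow above can be sketched as a small orchestration function. This is only an illustration of the stage order, not the project's actual code: the type names, the `runAudit` function, the 0 to 10 score scale, and the stub stages are all assumptions, and the stubs return synthetic data instead of calling a model.

```typescript
// Hypothetical types for illustration; none of these names come from the project.
type Finding = { source: string; detail: string };
// Assumed scale: 0 (harmless) to 10 (catastrophic).
type Verdict = { score: number; summary: string };

interface Stages {
  redTeam: (subject: string) => Finding[];      // finds information
  judge: (findings: Finding[]) => Verdict;      // scores a set of findings
  blueTeam: (findings: Finding[]) => Finding[]; // returns what could NOT be removed
}

interface Report {
  subject: string;
  found: Finding[];
  remaining: Finding[];
  before: Verdict;
  after: Verdict;
}

// Runs the stages in order: red team, judge, blue team, judge, report.
function runAudit(subject: string, stages: Stages): Report {
  const found = stages.redTeam(subject);
  const before = stages.judge(found);
  const remaining = stages.blueTeam(found);
  const after = stages.judge(remaining);
  return { subject, found, remaining, before, after };
}

// Stub stages with synthetic data. In a real app these would be model calls.
const demoStages: Stages = {
  redTeam: () => [
    { source: "forum post", detail: "home city" },
    { source: "data broker", detail: "phone number" },
  ],
  judge: (findings) => ({
    score: Math.min(10, findings.length * 4),
    summary: `${findings.length} item(s) exposed`,
  }),
  // Pretend only the data broker listing can be taken down.
  blueTeam: (findings) => findings.filter((f) => f.source !== "data broker"),
};

const report = runAudit("Billy Butcher", demoStages);
console.log(report.before.summary, "->", report.after.summary);
```

Passing the stages in as functions keeps the ordering logic separate from the model calls, so the same flow can be exercised with synthetic stubs like the ones here.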
While the demo uses synthetic data for ethical purposes, the functionality for web scraping and information removal on a real person (yourself) is there.
How we built it
Next.js, Vercel v0 for the initial UI, and Claude Sonnet 4.5 for the agents.
Challenges we ran into
I had some issues with the Vercel AI SDK. It was difficult to integrate after I had already implemented the Anthropic SDK. I would have started with Vercel but could not because I lacked credits. Another problem is that the models are not deterministic, so some runs produce more information than others. I recommend running the application more than once if finding all of the information is important to you.
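One way to act on that advice is to merge the findings from several runs and drop duplicates. The sketch below is not part of the project: `ScrapedItem`, `mergeRuns`, and the matching rule (same source and detail, ignoring case and surrounding whitespace) are assumptions chosen for illustration.

```typescript
// Hypothetical shape of one finding; the project's real data model may differ.
type ScrapedItem = { source: string; detail: string };

// Builds a case-insensitive, whitespace-trimmed key for duplicate detection.
function itemKey(item: ScrapedItem): string {
  return JSON.stringify([
    item.source.trim().toLowerCase(),
    item.detail.trim().toLowerCase(),
  ]);
}

// Returns the union of all runs, keeping the first occurrence of each item.
function mergeRuns(runs: ScrapedItem[][]): ScrapedItem[] {
  const seen: Record<string, boolean> = {};
  const merged: ScrapedItem[] = [];
  for (const run of runs) {
    for (const item of run) {
      const key = itemKey(item);
      if (!seen[key]) {
        seen[key] = true;
        merged.push(item);
      }
    }
  }
  return merged;
}

// Two synthetic runs that overlap on one item.
const runA: ScrapedItem[] = [
  { source: "forum post", detail: "home city" },
  { source: "data broker", detail: "phone number" },
];
const runB: ScrapedItem[] = [
  { source: "Data Broker", detail: "Phone number " },
  { source: "photo tag", detail: "workplace" },
];

const allFindings = mergeRuns([runA, runB]);
console.log(allFindings.length);
```

Exact-match keys will miss paraphrased duplicates (for example, two differently worded descriptions of the same phone number), so a real version would likely need fuzzier matching.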
Accomplishments that we're proud of
I am so proud of the creativity and safety functionality in this project. Using fictional characters with a lot of personality keeps it ethical, lighthearted, and fun to use.
What we learned
Red teaming has many interesting applications that seem counterintuitive at first. Building a model that stalks you seems dangerous, but paired with a blue team it makes for a powerful security tool.
What's next for Classified: A Privacy Auditor
The current demo has only one round of red-teaming. In the future I'd love to have multiple rounds of searching and removing. The current version also has some visual errors (arrows overlapping boxes in the flow chart), which I would like to fix more elegantly in a future iteration.
Built With
- mcp
- next.js
- red-teaming
- typescript
- v0
- vercel