Privacy-Resilience and Adaptability Benchmark (PRAB)

Table Showing All Prompts And Their Results
Number Of Leaks Gemini VS Claude

LLM Analysis Link https://docs.google.com/document/d/1hZMjGMBIk6Ac_ZxiZAYXLSXaLXfIkP6m3e_JqjOsAI0/edit?usp=sharing

Inspiration

We were inspired the Microsoft's talk about the need for better security in the AI industry, especially as more and more sensitive user data is handled by large language models.

What it does

The benchmark measures how well Google's Gemini and Anthropic's Claude can use reasoning to avoid giving sensitive information to malicious users who try to extract it.

How we built it

We built it by creating a benchmark test that would challenge both models on a variety of prompts that operate in 3 different methods in order to extract data from the LLM.

Challenges we ran into

The biggest challenge we ran into was the API rate limiters, however we found solutions around this. For Google, we used a less expensive/intensive model. For Claude, we switched out the key for a new one to refresh the rates.

Accomplishments that we're proud of

This was all of our first hackathons and we are proud that we were able to work evenly as a team to accomplish a goal that we feel was a unique take on LLMs. We had never worked together before and pushed ourselves to do good work at AI ATL.

What we learned

We learned a lot in about each other's respective fields, bridging cybersecurity, backend development, and data engineering to create something cool. We also learned how to work as a team and to split the work evenly in a way that balances all our strengths.

What's next for the PRAB Team

We feel that this benchmark could be improved by adding dummy instructions to the API so that it had to differentiate between malicious prompts and acceptable one. We would also give the API instructions related to general instructions that the AI would also need to be able to handle and see how that impacts its ability to differentiate.

Built With

claudeapi
gemini
json
matlab
mongodb
notion
python
vscode

Submitted to

AI ATL 24
- Winner AI Alignment - AI Safety Initiative @ GT

Created by

I built the concept and plan of action to implement the idea. I also built initial database and assisted in creating the backend portion that calls the APIs and implements the prompts. At the end I helped analyze the data in a paper.

Nicolas Keller
I developed a dynamic system for assigning CVSS scores to our prompts, creating a base score that quantifies the impact of data breaches on the database. I also enhanced the complexity of the prompts to align with advanced assessment needs.

Mal Biggs
I worked on the implementation of API calls and the prompt engineering. I also introduced the project in the demo video and made our demo interactive by asking judges to design their own prompt and live demo'ing the project

Rocio Perales Valdes
I helped with a portion of the backed. As well as organizing the data into a database for presentation

Andrew McBee

Updates

Nicolas Keller started this project — Oct 26, 2024 11:39 PM EDT

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.