LLM Analysis Link https://docs.google.com/document/d/1hZMjGMBIk6Ac_ZxiZAYXLSXaLXfIkP6m3e_JqjOsAI0/edit?usp=sharing

Inspiration

We were inspired the Microsoft's talk about the need for better security in the AI industry, especially as more and more sensitive user data is handled by large language models.

What it does

The benchmark measures how well Google's Gemini and Anthropic's Claude can use reasoning to avoid giving sensitive information to malicious users who try to extract it.

How we built it

We built it by creating a benchmark test that would challenge both models on a variety of prompts that operate in 3 different methods in order to extract data from the LLM.

Challenges we ran into

The biggest challenge we ran into was the API rate limiters, however we found solutions around this. For Google, we used a less expensive/intensive model. For Claude, we switched out the key for a new one to refresh the rates.

Accomplishments that we're proud of

This was all of our first hackathons and we are proud that we were able to work evenly as a team to accomplish a goal that we feel was a unique take on LLMs. We had never worked together before and pushed ourselves to do good work at AI ATL.

What we learned

We learned a lot in about each other's respective fields, bridging cybersecurity, backend development, and data engineering to create something cool. We also learned how to work as a team and to split the work evenly in a way that balances all our strengths.

What's next for the PRAB Team

We feel that this benchmark could be improved by adding dummy instructions to the API so that it had to differentiate between malicious prompts and acceptable one. We would also give the API instructions related to general instructions that the AI would also need to be able to handle and see how that impacts its ability to differentiate.

Built With

Share this project:

Updates