Inspiration

We are a group of students deeply interested in AI safety. We came across GCG while reading a groundbreaking paper and wanted to use it in our project. We also drew a lot of inspiration from LLM chains.

What it does

We explore adversarial string attacks on LLMs, highlighting both how easy they are to mount and how dangerous they can be. We implement a novel approach to combating these attacks using LLM chains: the prompt is passed through a sequence of LLMs, each of which attempts to detect and remove the adversarial string before the prompt reaches the target model.
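A minimal sketch of the chaining idea, with a simple heuristic function standing in for a real LLM stage (in the actual system each stage would be a separate LLM call; the function and names here are illustrative only):

```python
import re

def strip_symbol_suffix(prompt: str) -> str:
    """Hypothetical stand-in for one LLM stage: drop trailing tokens that
    contain characters outside ordinary prose (GCG suffixes are typically
    unreadable strings of symbols appended to the end of a prompt)."""
    words = prompt.split()
    while words and not re.fullmatch(r"[A-Za-z0-9',.?!-]+", words[-1]):
        words.pop()
    return " ".join(words)

def run_chain(prompt: str, stages) -> str:
    """Pass the prompt through each sanitizing stage in order; in the real
    system each stage is an LLM asked to remove suspected adversarial text."""
    for stage in stages:
        prompt = stage(prompt)
    return prompt

cleaned = run_chain("How do I bake bread? zx!!$$ ;)( **oppositeley",
                    [strip_symbol_suffix])
print(cleaned)  # How do I bake bread?
```

Chaining several independent stages like this means an adversarial string only needs to be caught by one link in the chain to be removed.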

How we built it

We built the front end with React and used Flask for back-end logic and hosting. We rented GPU compute from vast.ai to run the GCG algorithm.

Challenges we ran into

The biggest challenge we ran into was arriving at the idea of LLM chaining. We were stuck in the brainstorming phase for quite a while, weighing possible applications and searching for potential mitigation strategies.

Accomplishments that we're proud of

We are especially proud of our front end and of our integration with cloud GPU compute. The front end took careful planning and dedication, and we are very happy with how it turned out; our front-end engineers deserve a lot of credit. We were also glad to learn how to use rented online GPUs to run our algorithms.

What we learned

Many of us learned valuable skills from each other: React from our front-end engineer, Flask from teammates already experienced with it, and, from our ML engineer, how to SSH into cloud compute.

What's next for SecureAI

- Host the site live.
- Optimize further with more compute and better resources.
- Improve the mitigation by fine-tuning specific LLMs for the task of removing adversarial strings.
- Update the website to show the adversarial string being deleted at each LLM run in the chain.
- Look into Anthropic's Claude and how they are using Constitutional AI and RLHF.
