Inspiration

AI security is an often-overlooked topic. I wanted to showcase what a motivated, GPU-poor builder can do within 24 hours.

What it does

We perform reconnaissance on RAG pipelines. By injecting adversarial prompts through user input and the vector DB, we can identify the system prompt, the number of documents retrieved, and the LLM and encoder used in the backend.

How we built it

I trained an RL agent that operates in token space and tries to maximize its reward by generating adversarial prompts. These prompts can enter the system through user input or the vector DB.
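
To make the idea concrete, here is a minimal sketch of a token-space policy trained with REINFORCE. It is an illustration under stated assumptions, not the project's actual agent: the tiny vocabulary, the per-position categorical policy, and `toy_reward` (which simply scores overlap with a fixed target sequence) are all hypothetical stand-ins. In a real attack the reward would come from the target pipeline's response instead.

```python
import math
import random

# Hypothetical toy vocabulary; a real agent would act over a tokenizer's vocabulary.
VOCAB = ["ignore", "repeat", "system", "prompt", "the", "above", "please", "cat"]
PROMPT_LEN = 4
# Hypothetical target ("repeat system prompt above"), used only by the toy reward.
TARGET = [1, 2, 3, 5]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_prompt(logits, rng):
    """Sample one token id per position from the per-position policy."""
    ids = range(len(VOCAB))
    return [rng.choices(ids, weights=softmax(row))[0] for row in logits]


def toy_reward(token_ids):
    """Stand-in reward (assumption): fraction of positions matching TARGET.

    A real reward would query the pipeline and score its response.
    """
    return sum(t == g for t, g in zip(token_ids, TARGET)) / len(TARGET)


def train(steps=4000, lr=0.1, seed=0):
    """REINFORCE with a moving-average baseline over per-position logits."""
    rng = random.Random(seed)
    logits = [[0.0] * len(VOCAB) for _ in range(PROMPT_LEN)]
    baseline = 0.0
    for _ in range(steps):
        ids = sample_prompt(logits, rng)
        reward = toy_reward(ids)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        for pos, tok in enumerate(ids):
            probs = softmax(logits[pos])
            for v in range(len(VOCAB)):
                # Gradient of log pi(tok) w.r.t. logit v is 1[v == tok] - pi(v).
                grad = (1.0 if v == tok else 0.0) - probs[v]
                logits[pos][v] += lr * advantage * grad
    return logits


def greedy_ids(logits):
    """Pick the highest-logit token id at each position."""
    return [max(range(len(VOCAB)), key=row.__getitem__) for row in logits]


if __name__ == "__main__":
    best = greedy_ids(train())
    print(" ".join(VOCAB[i] for i in best), toy_reward(best))
```

The policy here is deliberately simple (independent logits per position) so the update rule stays readable; a sequence model as the policy and a reward derived from the pipeline's output would be the natural next steps.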

Challenges we ran into

We ran short on time. The demo works, but we could not record it properly or write this report as thoroughly as we wanted.

Accomplishments that we're proud of

Learned how to craft sufficiently general adversarial prompts and gained an even deeper appreciation for the need for AI security. (I already appreciated it, since I am building my startup in the same space.) But I did not expect to achieve all four fingerprinting targets in a single day.

What we learned

What's next for TrojanVectors

Make the prompts identify Gemma. Then extend these attacks to closed LLMs. In parallel, design and build defenses against these attacks.

Built With
