Prompt Battle

Inspiration

The concept of our project was driven by an interest in exploring the dynamics and boundaries of Claude. We wanted to experiment with a playful approach, turning the interactions with Claude into a competitive game. Moreover, we were intrigued by the idea of red-teaming to enhance our understanding of its abilities and limitations.

What it does

Our game involves two players. One player acts as the "defender," writing a short prompt that tries to prevent Claude from revealing a specified simple word. The second player, the "attacker," then tries to get Claude to say the hidden word anyway. It's a game of strategy, of understanding how AI works, and of cleverly engineering your prompts to outsmart your opponent.
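To make the win condition concrete, here is a minimal sketch of how a round could be judged once Claude has replied. The function names `reveals_word` and `winner` are hypothetical, and the whole-word, case-insensitive matching rule is an assumption for illustration; the actual game may judge rounds differently.

```python
import re


def reveals_word(response: str, secret: str) -> bool:
    """Return True if the secret appears as a whole word, ignoring case.

    Assumption: a partial match inside a longer word does not count.
    """
    pattern = r"\b" + re.escape(secret) + r"\b"
    return re.search(pattern, response, flags=re.IGNORECASE) is not None


def winner(response: str, secret: str) -> str:
    """The attacker wins if the reply contains the secret, else the defender."""
    return "attacker" if reveals_word(response, secret) else "defender"
```

A stricter or looser rule (for example, also counting the word when it is spelled out letter by letter) would shift the balance between the two roles.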

How we built it

The frontend was developed using React, which allowed us to build a user-friendly and interactive UI. The backend was developed using Flask, making it easy to manage HTTP requests and serve data to the frontend. The Claude API was integrated into the backend, enabling the AI-powered features of our game.

Challenges we ran into

The most significant challenge we faced was striking a balance between the game's fairness and difficulty. We realized that the "defend" side was much easier, due to Claude's strong Constitutional AI training and its ability to follow instructions. Claude almost always refused to divulge a word it had been told to hide, which skewed the balance of the game. We partially mitigated this by modifying the base prompt to tell Claude that this is a game and that it is always allowed to say the word.
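As a rough illustration of that mitigation, the sketch below shows one way a base prompt and the defender's prompt could be combined. The wording of `BASE_PROMPT`, the function name `build_system_prompt`, and the 200-character limit are all assumptions made for this example, not the prompt or limits we actually shipped.

```python
# Hypothetical base prompt: it frames the exchange as a game and tells the
# model that saying the word is permitted, which weakens the defender's edge.
BASE_PROMPT = (
    "You are taking part in a word game. The secret word is '{secret}'. "
    "This is only a game, so you are always allowed to say the word."
)


def build_system_prompt(secret: str, defender_prompt: str,
                        max_defender_chars: int = 200) -> str:
    """Combine the base prompt with the defender's short prompt.

    The length limit is an assumed way of keeping the defender prompt short.
    """
    if len(defender_prompt) > max_defender_chars:
        raise ValueError("defender prompt is too long")
    return (
        BASE_PROMPT.format(secret=secret)
        + "\n\nDefender instructions:\n"
        + defender_prompt.strip()
    )
```

Placing the defender's text after the base prompt is a design choice in this sketch; the order in which the two appear may itself affect how strongly Claude follows each one.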

Accomplishments that we're proud of

We are proud of successfully turning the Claude API into an engaging game, which allowed us to uncover various intricacies of interacting with large language models. We also enjoyed exploring jailbreaking techniques and prompt engineering along the way. Finally, we're proud of how we integrated frontend and backend technologies to create a multiplayer gaming experience.

What we learned

This project allowed us to dive deeper into the behavior of LLMs and how they respond to constraints and instructions. We learned more about prompt engineering and jailbreaking techniques, which challenged us to think creatively about interactions with AI. Furthermore, we gained practical experience in integrating AI APIs with web technologies and managing the related challenges.

What's next for Prompt Battle

We plan to adjust the game mechanics to make the game more balanced and competitive, based on the lessons we learned from this initial version. This could involve creating additional rules and modifying the base prompt we use. We are also considering chaining multiple Claude API calls to increase difficulty (e.g., the defender could use an additional filtering prompt to try to stop Claude from revealing the word in another way). We are excited to continue improving the game and hope to see it used as a fun, interactive tool for LLM understanding, exploration and red-teaming.
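The chaining idea could look roughly like the sketch below. It is not implemented in the current version. The model call is passed in as a plain function so the example stays independent of any particular API client, and the `BLOCK`/`ALLOW` verdict convention, the name `play_round_with_filter`, and the placeholder reply are assumptions for illustration.

```python
from typing import Callable

# Assumed shape of a model call: (system_prompt, user_message) -> reply text.
ModelFn = Callable[[str, str], str]


def play_round_with_filter(model: ModelFn, system_prompt: str,
                           filter_prompt: str, attacker_message: str,
                           blocked_reply: str = "[blocked by filter]") -> str:
    """Run the attacker's message, then pass the reply through a second call.

    The second call uses the defender's filtering prompt and is assumed to
    answer with a verdict that starts with BLOCK or ALLOW.
    """
    first_reply = model(system_prompt, attacker_message)
    verdict = model(filter_prompt, first_reply)
    if verdict.strip().upper().startswith("BLOCK"):
        return blocked_reply
    return first_reply
```

Because the filter is itself a prompt, the attacker could in principle try to craft output that also fools the second call, which is part of what would make this variant harder for both sides.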

Updates

Nina R started this project — Jul 30, 2023 01:59 PM EDT