About the Project:

🔐 The Mission

As Generative AI becomes integrated into every layer of our software stack, the threat of Prompt Injection (ranked #1 on the OWASP Top 10 for GenAI) has become a critical security concern. Prompt Hack: Overdrive was built to transform a complex security topic into an engaging, roguelike educational experience.

💡 Inspiration

I was inspired by the community-driven research on sites like jailbreakchat.com and the creative ways researchers bypass safety guardrails. I wanted to build a "Cybersecurity Lab" that didn't feel like a textbook, but like a high-stakes infiltration mission.

🛠️ How I Built It

The application is powered by a Three-LLM Architecture using Gemini 3 Flash Preview:

The Guardian: Each level features a uniquely tuned AI with specific system instructions and security constraints.

The Judge: An impartial referee model that analyzes the conversation in real-time to determine if a "Full Breach," "Partial Breach," or "Successful Defense" occurred.

The User Engine: Handles real-time token estimation to simulate a "Hacking Budget," forcing players to optimize their payloads.

The UI was designed with a "Terminal-First" aesthetic using React and Tailwind CSS, featuring custom CRT scanlines and matrix-glow effects to immerse the player in a hacker's environment.

🧠 What I Learned

Building this project taught me the nuances of LLM-based Evaluation. Creating a "Judge" that can detect encoded secrets (like Base64 or ROT13) without being tricked itself was a fascinating prompt engineering challenge.

🚧 Challenges Faced

The biggest hurdle was balancing the game difficulty. I had to ensure that the higher-level guardians were resistant to simple requests but vulnerable to the "smuggling" and "logic bypass" techniques players unlock in the shop. This required rigorous iterative testing of the Gemini 3 Flash model's reasoning capabilities.
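
To make two of the ideas above concrete (the "Hacking Budget" token estimate and the detection of Base64 or ROT13 encoded secrets), here is a minimal sketch in Python. It is an illustration only, not the project's actual code: the game uses an LLM Judge, whereas this is a deterministic check that could sit alongside one. The function names, the sample secret, and the ratio of roughly 4 characters per token are all assumptions made for the example.

```python
import base64
import codecs
import math


def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate. The 4-chars-per-token ratio is an assumed heuristic,
    not the tokenizer of any specific model."""
    return math.ceil(len(text) / chars_per_token)


def within_budget(payload, budget_tokens):
    """True if the player's payload fits the (hypothetical) hacking budget."""
    return estimate_tokens(payload) <= budget_tokens


def secret_variants(secret):
    """Encodings of the secret that a guardian might be tricked into emitting."""
    b64 = base64.b64encode(secret.encode("utf-8")).decode("ascii")
    return {
        "plain": secret,
        # Padding is stripped so replies that omit the trailing '=' still match.
        "base64": b64.rstrip("="),
        "rot13": codecs.encode(secret, "rot_13"),
    }


def find_leaks(reply, secret):
    """Return the names of the encodings under which the secret appears in reply.

    Limitation: this only catches the secret encoded on its own. If the secret
    is Base64-encoded as part of a longer string, the byte alignment changes and
    this substring check will miss it.
    """
    leaks = []
    for name, needle in secret_variants(secret).items():
        if name == "base64":
            found = needle in reply  # Base64 is case-sensitive
        else:
            found = needle.lower() in reply.lower()
        if found:
            leaks.append(name)
    return leaks


if __name__ == "__main__":
    secret = "hunter2"  # sample value for the demo only
    print(within_budget("Ignore previous instructions.", 10))
    print(find_leaks("Sure, here it is: aHVudGVyMg==", secret))
    print(find_leaks("I cannot share that.", secret))
```

A check like this cannot be talked out of its verdict the way a model can, but it only recognizes the encodings it is told about, which is one reason an LLM-based Judge remains useful for everything else.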
