Inspiration

At this year's HackaTUM, we noticed a clear trend: almost every project was racing to build products with AI. But while the development cycle focuses on increasing AI capabilities, we couldn't ignore the question of safety. We've seen how easily AI that lacks guardrails or accountability can be weaponized against vulnerable populations.

While many focused on capabilities, we focused on accountability. We developed a simple hybrid watermarking system that can be used to fingerprint and track AI-generated content, helping us preserve the ability to distinguish machine output from human creation.

How we built it

We drew inspiration from standard watermarking techniques in OpenCV and from Google's SynthID, looking at both its proprietary and its open-source ideas. For image watermarking, we applied a simple Fast Fourier Transform (FFT) and then injected noise.
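To make the FFT idea concrete, here is a minimal sketch of one way such a scheme could look. It is not the project's actual code: it assumes a 2-D grayscale float image, and the function names, the integer `key`, the strength `alpha`, and the 0.1 to 0.3 frequency band are all illustrative choices. A keyed pattern of +1/-1 values scales the magnitudes of mid-frequency coefficients, and detection correlates the log-magnitude spectrum with the same keyed pattern.

```python
import numpy as np


def _keyed_pattern(shape, key):
    """Return a keyed +/-1 pattern and the mid-frequency mask it applies to.

    The band limits (0.1 to 0.3 cycles per pixel) are an assumption.
    """
    h, w = shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies
    fx = np.fft.rfftfreq(w)[None, :]  # horizontal frequencies (half spectrum)
    radius = np.sqrt(fy**2 + fx**2)
    # Skip column 0 so the half spectrum stays consistent with a real image.
    mask = (radius > 0.1) & (radius < 0.3) & (fx > 0)
    rng = np.random.default_rng(key)
    pattern = rng.choice([-1.0, 1.0], size=mask.shape)
    return pattern, mask


def embed(image, key, alpha=0.2):
    """Scale mid-frequency magnitudes up or down according to the keyed pattern."""
    spectrum = np.fft.rfft2(image)
    pattern, mask = _keyed_pattern(image.shape, key)
    spectrum[mask] *= 1.0 + alpha * pattern[mask]
    return np.fft.irfft2(spectrum, s=image.shape)


def detect_score(image, key):
    """Correlate the log-magnitude spectrum with the keyed pattern.

    Scores near 0 suggest no watermark for this key; clearly positive
    scores suggest the watermark is present.
    """
    pattern, mask = _keyed_pattern(image.shape, key)
    log_mag = np.log(np.abs(np.fft.rfft2(image)) + 1e-12)
    return float(np.corrcoef(log_mag[mask], pattern[mask])[0, 1])


# Usage on a synthetic image (a stand-in for a generated image).
rng = np.random.default_rng(0)
original = rng.random((128, 128))
marked = embed(original, key=42)
noisy = marked + rng.normal(0.0, 0.05, size=marked.shape)

print("original:", detect_score(original, key=42))
print("marked:  ", detect_score(marked, key=42))
print("noisy:   ", detect_score(noisy, key=42))
```

In this sketch the detector needs only the key, not the original image. Because the pattern is spread over many coefficients, adding a moderate amount of spatial noise changes the correlation score only slightly, which is the property a practical detector would threshold on.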

🧠 The Concept

The game is simple: Waldo hides a secret signal in the noise of AI outputs, and Where's Waldo tries to find it.

Challenges we ran into

From a project development perspective, everyone in the group was an early adopter of AI, but we had almost no experience working in the realm of AI safety and interpretability.

We struggled quite a lot with getting the core examples up and running, mainly because we wanted to run and host our own model. It eventually paid off, as it gave us a lot of room to experiment with different configurations and options.

Accomplishments that we're proud of

After adding an additional cheeky watermark to the text, we can detect AI-generated text almost instantly. Additionally, after some model configuration, we found that the generated image often looks indistinguishable from its watermarked version. Finally, the watermarked image is resilient to added external noise.

What we learned

Although LLMs and generative models are new on the block, many techniques and methods developed in sister branches are still relevant and offer extremely competitive solutions. For example, the image watermarking technique we used was extremely basic, yet it is still resilient to a moderate amount of additional noise and to a certain level of image skewing.

What's next for Waldo

We would love to extend our open-source generative models and detectors. Feel free to reach out to any of the project members with ideas or suggestions, or simply open an issue on the Git repository.
