Inspiration

Art galleries are simple. Walk in, look at art, read placard, repeat.

You'd think that would be engaging enough to stick with. But then it's Room 7 of "trying to appreciate cultural significance," and you're facing your fifteenth portrait of some historical figure, with the same sterile description, feeling the same creeping gallery fatigue. You start periodically checking your phone to see when it's socially acceptable to leave. Around painting twenty-three, you start bargaining with yourself: "If I just pretend to understand the significance of this brushwork, I can skip the rest of this wing, find the gift shop, and pretend I had a profound experience." That's where Museum's Wit comes in: it's your time, so you might as well enjoy it.

Standing in front of Banksy's "Love is in the Bin" at Sotheby's, I watched people snap quick photos and move on, missing the revolutionary story behind it: a painting that literally self-destructed the moment it sold for $1.4 million, transforming from street art into a radical statement about commercialism, value, and the art market itself. Without that context, it's just a half-shredded picture of a girl with a balloon.

Great art isn't just about what you see; it's about the stories, scandals, and subversions behind the canvas. Museum's Wit turns your Snapchat Spectacles into that irreverent friend who knows all the juicy details, delivering them right when you need them, without you having to bury your face in your phone or an audio guide.

What it does

Simple: You look at art and click a button. Museum's Wit does the rest.

When you're wearing Snapchat Spectacles and looking at artwork in a museum, simply press the virtual button on the glasses frame. We instantly capture what you're seeing and use Google Gemini's computer vision capabilities to identify the artwork. Within seconds, the essential story behind the piece appears: not just academic facts, but the compelling narrative that makes it worth understanding.

The information appears in your field of vision through an AR overlay, showing you:

  • The transformative story behind the artwork's creation
  • Historical context that might change how you see the piece
  • Artist's intentions and the piece's cultural impact
  • Visual highlights of key details you might otherwise miss

All of this happens without you needing to look away from the artwork itself. Just look, click, and suddenly understand why this piece matters.
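As a sketch, the overlay content above might be structured something like this in TypeScript. The field and function names here are illustrative assumptions for this writeup, not the project's actual schema:

```typescript
// Illustrative sketch: field and function names are assumptions, not the
// project's actual schema.
interface ArtworkStory {
  title: string;          // identified artwork, e.g. "Love is in the Bin"
  creationStory: string;  // the transformative story behind its creation
  context: string;        // historical context that reframes the piece
  impact: string;         // artist's intentions and cultural impact
  highlights: string[];   // key visual details worth a closer look
}

// Flatten the story into the short lines an AR overlay can render.
function renderOverlay(story: ArtworkStory): string[] {
  return [
    story.title,
    story.creationStory,
    story.context,
    story.impact,
    ...story.highlights.map((h) => `Look for: ${h}`),
  ];
}
```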

How we built it

Running on Monster Energy, hot Doritos, and good spirits, we implemented Museum's Wit in Snap's Lens Studio. We built it with TypeScript and used the Google Gemini API for the creative heavy lifting.

Our system works in three main steps:

  1. Image Capture: Using Snapchat Spectacles' button interface and camera, we capture what the user is looking at when they press the button.
  2. Artwork Recognition & Story Generation: The captured image is sent to our Node.js backend server, which processes it using the Google Gemini 2.0 Flash model. We've crafted a specialized prompt that instructs Gemini to:
    • Identify if the image contains a recognized artwork
    • If recognized, provide the fascinating historical story behind it
    • Focus on why it was created and what made it revolutionary
  3. AR Presentation: The response from Gemini is displayed back to the user through Lens Studio's AR interface, positioned to complement rather than obstruct the artwork itself.

Our server uses Express.js to handle image uploading and API communication, with optimizations to ensure quick response times essential for maintaining user engagement.
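A minimal sketch of the recognition step: the URL below is the public Gemini REST endpoint, but the prompt wording and helper names (`buildRequestBody`, `askGemini`) are illustrative assumptions, not our production code:

```typescript
// Hedged sketch of the recognition call: the URL is the public Gemini REST
// endpoint; the prompt text and helper names are illustrative assumptions.
const GEMINI_URL =
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent";

const ARTWORK_PROMPT =
  "Identify the artwork in this image. If you recognize it, tell the " +
  "fascinating story behind it: why it was created and what made it " +
  "revolutionary. If it is not a recognized artwork, say so briefly.";

// Build the JSON body the Gemini REST API expects: one text part with the
// prompt, one inline image part (base64-encoded JPEG).
function buildRequestBody(imageBase64: string) {
  return {
    contents: [
      {
        parts: [
          { text: ARTWORK_PROMPT },
          { inline_data: { mime_type: "image/jpeg", data: imageBase64 } },
        ],
      },
    ],
  };
}

// Send a captured frame to Gemini and return the first text candidate.
async function askGemini(imageBase64: string, apiKey: string): Promise<string> {
  const res = await fetch(`${GEMINI_URL}?key=${apiKey}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildRequestBody(imageBase64)),
  });
  const data: any = await res.json();
  return data?.candidates?.[0]?.content?.parts?.[0]?.text ?? "No story found.";
}
```

The Express route itself then just decodes the uploaded image, awaits the Gemini response, and forwards the text back to the lens.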

[Flowchart: Museum's Wit pipeline from image capture to AR overlay]

The flowchart above illustrates the complete process from the moment a user looks at artwork to receiving the contextual information through AR overlay. The system spans three main components: the Snapchat Spectacles hardware/Lens Studio environment, our custom Node.js backend server, and the Google Gemini API that powers our artwork recognition and storytelling.

Challenges we ran into

  • First-time AR Development: None of us had prior experience with Lens Studio or AR development. Coming primarily from web development backgrounds, the spatial thinking required for AR was a significant learning curve. One team member's experience with Unity helped bridge some knowledge gaps, but Lens Studio's specific workflows and TypeScript integration still required substantial adaptation.
  • Lens Studio: Working with Lens Studio really pushed us to quickly learn a completely different development environment and paradigm. The combination of 3D positioning, AR overlays, and integrating with Spectacles hardware was far outside our usual tech comfort zones. We spent the first 16 hours just getting familiar with the basics while watching tutorials and experimenting.
  • Balancing API Response Length: Getting Gemini to provide responses that were informative yet concise enough for AR display required significant prompt engineering.
  • Crafting the Right Voice: We wanted our companion to have personality while still delivering accurate art information. This required careful balancing in our prompt design.
  • Latency Reduction: The round-trip from image capture to API response to AR display initially took too long for a seamless experience. We optimized our server configuration and response handling to minimize this delay.
  • Improving Visuals: Creating AR visuals that enhanced rather than distracted from the artwork was a constant challenge. We went through multiple design iterations to find the right balance of information presentation and visual subtlety.
  • Creating a Non-Obtrusive Experience: We struggled to design an interface that enhanced the museum experience without turning it into a "game" or technical challenge. Our goal was to make the technology invisible, allowing users to focus on the art while gaining deeper understanding and appreciation.
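To illustrate the response-length and voice balancing mentioned above, here is a hypothetical version of that kind of prompt plus a word-count guard; the exact wording we shipped went through many more iterations:

```typescript
// Hypothetical prompt along the lines of what we converged on; the word cap
// and the persona line were the two levers that mattered most.
const STORY_PROMPT = [
  "You are a witty, slightly irreverent museum companion.",
  "Identify the artwork in the image. If recognized, lead with the most",
  "compelling part of its story: the scandal, subversion, or surprise.",
  "Hard limit: 60 words. No markdown, no bullet points; this renders",
  "as a short AR caption.",
].join(" ");

// Belt-and-braces guard: even with a cap in the prompt, models sometimes
// run long, so clamp the response before it reaches the overlay.
function fitForAR(text: string, maxWords = 60): string {
  const words = text.trim().split(/\s+/);
  return words.length <= maxWords
    ? text.trim()
    : words.slice(0, maxWords).join(" ") + "…";
}
```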

Accomplishments that we're proud of

  • New Tech: Despite none of us having prior experience with Lens Studio or AR development, we successfully created a working AR application for Snapchat Spectacles that delivers real value to users.
  • Effective API Integration: We integrated the Google Gemini API for the first time, which was a big learning goal for this hackathon.
  • Server Setup: We built an Express back-end server that handles image processing, API communication, and response formatting, all critical components of the application. It also gave us a place to tweak things on the fly, such as the text returned to the lens, without having to edit the lens code.
  • Rapid MVP Development: We went from concept to working prototype remarkably quickly, despite the steep learning curve of new technologies.
  • Creating a Distinctive Voice: We developed a unique, engaging personality for our art companion that makes learning about art history entertaining rather than dry and academic.
  • Ease of Use: We created a seamless one-button interface that makes advanced technology accessible to users with no technical knowledge or training.
  • Building for Social Impact: We developed an application that genuinely enhances cultural experiences and makes art more accessible and engaging for all visitors, regardless of their prior art knowledge.

What we learned

  • Pair Programming Works: We found that tackling complex challenges together at one computer was significantly more effective than working separately. When integrating our webserver with our AR display system, having two sets of eyes on the code helped us catch errors immediately and brainstorm better solutions on the spot.
  • How to Split the Work: We learned to be specific about responsibilities. Instead of vague "you handle the backend" assignments, we created clear ownership: "Mohareb is responsible for the Gemini API integration, Abdullah owns the AR display logic, and Abdulaziz manages the image capture implementation." This clarity eliminated confusion and prevented duplicate work.
  • Prediction Haki is Important: When testing with a museum's spotlighting, we discovered our image recognition struggled with glare. Had we anticipated this common museum condition earlier, we wouldn't have found ourselves rebuilding our image normalization pipeline right before the demo.
  • Documentation Saves Lives: We wasted a couple of hours debugging why our API calls weren't working, only to discover that Lens Studio's image capture was sending base64-encoded images instead of raw files. Having better references on hand would have solved this in minutes instead of hours.
  • Test with Real Users: Our initial interface design looked beautiful to us but confused our first test user completely. Watching real people interact with our app revealed that our "intuitive" button placement was anything but, leading to a simplified design that actually worked.
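The base64 lesson above, distilled into a few lines (an illustrative helper, not our actual route handler):

```typescript
// The base64 lesson in code (illustrative helper, not our actual route):
// Lens Studio sent base64 strings, not raw bytes, so the server must decode
// before treating the payload as an image.
function decodeCapturedImage(payload: string): Buffer {
  // Captures may arrive as a data URL; strip the prefix if present.
  const base64 = payload.replace(/^data:image\/\w+;base64,/, "");
  return Buffer.from(base64, "base64");
}
```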

What's next for Museum's Wit

While the ultimate goal remains enhancing the experience of artwork, in museums and beyond, the next step for Museum's Wit is to launch on Lens Studio.

Built With

  • TypeScript
  • Lens Studio
  • Node.js
  • Express.js
  • Google Gemini API
