Inspiration

Reading is magical, but it's all in your head. We wanted to bring it out, so when a book describes a spaceship or an alien, you can actually see it standing in your room. The goal from day one: build something that works with any book, not just one specific story.

What it does

Livi is an AR reading buddy for Snap Spectacles. As you read, it watches the page through the camera and figures out what's being described - a character, a vehicle, a place - then drops the matching 3D scene in front of you. Look at your palm and a floating relationship map of all the characters appears, anchored in your room so you can walk around it like a hologram. Swap in a different book and Livi just keeps working.

How we built it

We built it in Snap Lens Studio for Spectacles. Gemini Vision does the heavy lifting: Spectacles snaps a frame of the page every couple of seconds, sends it to Gemini, and gets back a keyword for what the page is about. From there, a small router picks the right 3D model and shows it. The palm button uses Snap's Spectacles Interaction Kit, so you can pinch to toggle the relationship map on and off.
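For flavor, here's a simplified sketch of that loop in plain TypeScript. The helpers captureFrame, askGemini, and showModel are stubbed placeholders, not real Lens Studio or Gemini SDK calls, and the keyword-to-model table is invented for illustration; the actual lens routes all of this through Lens Studio's camera and networking modules.

```typescript
// Simplified sketch of the capture -> Gemini -> router loop.
// captureFrame(), askGemini(), and showModel() are placeholders for
// the Lens Studio plumbing; the model names are hypothetical.

const MODEL_FOR_KEYWORD: Record<string, string> = {
  spaceship: "SpaceshipModel",
  alien: "AlienModel",
  castle: "CastleModel",
};

let currentKeyword = "";

async function captureFrame(): Promise<string> {
  // Placeholder: on Spectacles this grabs a camera frame and returns
  // it as a base64 JPEG.
  return "";
}

async function askGemini(frameBase64: string): Promise<string> {
  // Placeholder: sends the frame to Gemini Vision and returns its
  // one-word answer, trimmed and lowercased.
  return "none";
}

function showModel(name: string): void {
  // Placeholder: enables the scene object holding this model and
  // hides the others.
}

async function tick(): Promise<void> {
  const keyword = await askGemini(await captureFrame());
  if (keyword === currentKeyword) return; // only react to changes
  const model = MODEL_FOR_KEYWORD[keyword];
  if (model) {
    currentKeyword = keyword;
    showModel(model);
  }
}

// "Every couple of seconds" is the capture cadence described above.
setInterval(() => void tick(), 2000);
```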

Challenges we ran into

Reading text from a moving camera is way harder than it sounds. We tried image markers first and they were too fragile on Spectacles, so we pivoted to AI vision instead. Even then, getting Gemini to give us reliable one-word answers without false positives on walls or random pages took a lot of prompt tweaking. Lens Studio also threw a bunch of small surprises at us - invisible text components, broken material imports, weird parenting behaviors. Honestly, most of our time was spent debugging the platform, not the idea itself. And designing UI that floats in space is a totally different beast from designing for a screen. Things kept ending up at the wrong scale or in the wrong place.
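To give a flavor of what all that prompt tweaking converged toward, here's a hedged sketch: constrain Gemini to a closed vocabulary with an explicit "none" escape hatch, then validate the answer in code so a chatty or unexpected response can never spawn a model. The keyword list and prompt wording below are illustrative, not our exact shipped strings.

```typescript
// Illustrative only: not the exact prompt or keyword list we shipped.
const KEYWORDS = ["spaceship", "alien", "castle", "none"] as const;
type Keyword = (typeof KEYWORDS)[number];

const PROMPT =
  "You are looking at a photo of a book page. " +
  "Reply with exactly one lowercase word from this list: " +
  KEYWORDS.join(", ") +
  ". If the photo is not a book page, or the page does not clearly " +
  "describe one of these, reply: none.";

// Treat anything outside the allowlist as "none" so false positives
// on walls or random pages get filtered in code, not just in the prompt.
function sanitize(raw: string): Keyword {
  const word = raw.trim().toLowerCase();
  return (KEYWORDS as readonly string[]).includes(word)
    ? (word as Keyword)
    : "none";
}
```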

Accomplishments that we're proud of

The biggest one is simply that we shipped at all. None of us had built XR before or touched Lens Studio or Spectacles. We spent the first half of the hackathon just figuring out how to navigate the editor, where things lived, what a "scene object" was versus an "asset," and why our scripts kept silently failing. By the time we found our footing we were way behind, but we caught up, and the final lens actually works. We're also proud that the whole experience runs hands-free with no menus or controllers, and that the architecture isn't locked to one book. The Spectacles camera struggles to catch small text on a page, so detection still misses sometimes, but that's a known limitation rather than a broken feature, and the experience holds up regardless.

What we learned

- Lens Studio & Spectacles: navigating the editor, scene objects vs. assets, pushing to device
- TypeScript scripting: @input fields, lifecycle events, async camera + API calls
- 3D assets: GLB imports, fixing purple materials, relinking textures to PBR
- Gemini Vision: prompt design for single-word answers, latency, token management
- Interaction Kit: palm anchoring, PinchButton wiring (sketched below), hand-tracking quirks
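Here's a rough sketch of that PinchButton wiring in Lens Studio TypeScript. It uses Lens Studio's @component / @input / BaseScriptComponent pattern; the SIK import path and the onButtonPinched event name follow the kit's samples and may differ across versions, so treat this as a sketch rather than copy-paste code.

```typescript
// Sketch of the pinch-to-toggle wiring; verify the import path and
// event name against your Spectacles Interaction Kit version.
import { PinchButton } from "SpectaclesInteractionKit/Components/UI/PinchButton/PinchButton";

@component
export class MapToggle extends BaseScriptComponent {
  @input
  pinchButton!: PinchButton; // the palm-anchored SIK button

  @input
  mapRoot!: SceneObject; // root object of the relationship map

  onAwake() {
    this.mapRoot.enabled = false; // map starts hidden
    this.pinchButton.onButtonPinched.add(() => {
      // Flip the map's visibility on every pinch.
      this.mapRoot.enabled = !this.mapRoot.enabled;
    });
  }
}
```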

What's next for Livi

- A book-picker so users can tell Livi what they're reading and the lens loads the right content.
- Auto-generating the relationship map from any book's text, so we don't have to hand-build it for every title.
- Animated scenes instead of static models.
- Letting users tap a character on the map to see chapter notes or context.
- A shared mode so two people in the same room can explore the same map together.
- Eventually, going beyond novels into textbooks, recipes, and manuals: anywhere a page describes something visual, Livi can bring it to life.

Built With

Gemini Vision, Snap Lens Studio, Snap Spectacles (with the Spectacles Interaction Kit), TypeScript
