Inspired by Tony Stark’s hologram scene, we wanted to build something that makes spatial interaction feel natural, like you can just reach out and use it, without needing a headset or fancy controllers. Right from the start, we hooked in the Gemini Live API to power “Jarvis,” our voice assistant that can answer questions about the designs you’re working on. Think of it like having an AI lab partner who knows what you’re looking at and can explain or give extra context.

From there, we set out to make the interaction itself feel as smooth as possible. The idea was simple: reach into the screen, pinch to grab, spread to scale, rotate to inspect, and then just ask Jarvis about what you’re seeing. To do that, we turned a regular webcam into our sensor and used MediaPipe Hands + OpenCV for real-time tracking and rendering. On the graphics side, we built a full 3D pipeline for meshes: loading, projecting, lighting, and transforming them in real time. Gestures come straight from hand landmark geometry: pinching grabs, spreading scales, and using both hands lets you rotate things smoothly. Jarvis sits on top of all that, listening for questions and giving answers based on the model’s metadata and the current scene.

The tricky parts were stability and ambiguity, since a webcam gives us no real depth. Hands jitter, sometimes overlap, and it’s super easy to accidentally pick the wrong object. We ended up smoothing motion over time, adding “snap zones” to make selection less frustrating, and tuning gesture thresholds when detection got noisy. We also auto-normalize OBJ files for scale and centering, fall back to wireframe mode if a mesh gets too heavy to render smoothly, and run a quick calibration step so lighting and skin tones don’t throw off recognition.

Next steps: we want to layer in SLAM for true 3D anchoring, take advantage of depth sensors when available, and expand Jarvis with retrieval so it can cite sources and walk you through processes, not just answer quick questions.
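To make the "pinching grabs" and "smoothing motion over time" ideas above a bit more concrete, here is a minimal sketch of how a pinch could be read from hand landmarks. It is not the project's actual code. The landmark indices follow MediaPipe's 21-point hand model, but the `PinchDetector` class, the `pinch_ratio` helper, the smoothing factor, and both thresholds are hypothetical values picked for illustration.

```python
import math

# Landmark indices from MediaPipe's 21-point hand model.
WRIST, THUMB_TIP, INDEX_TIP, MIDDLE_MCP = 0, 4, 8, 9


def pinch_ratio(landmarks):
    """Thumb-to-index distance divided by palm length.

    `landmarks` is a sequence of 21 (x, y) points in normalized image
    coordinates. Dividing by the wrist-to-middle-knuckle distance keeps the
    value roughly stable as the hand moves toward or away from the camera.
    Returns None if the palm length is degenerate.
    """
    palm = math.dist(landmarks[WRIST], landmarks[MIDDLE_MCP])
    if palm < 1e-6:
        return None
    return math.dist(landmarks[THUMB_TIP], landmarks[INDEX_TIP]) / palm


class PinchDetector:
    """Hypothetical helper: smoothed pinch detection with hysteresis."""

    def __init__(self, alpha=0.5, grab_below=0.25, release_above=0.40):
        # All three values are assumptions; real ones would need tuning.
        self.alpha = alpha                  # weight of the newest frame
        self.grab_below = grab_below        # start grabbing under this ratio
        self.release_above = release_above  # stop grabbing over this ratio
        self.smoothed = None
        self.grabbing = False

    def update(self, landmarks):
        """Feed one frame of landmarks; returns True while a grab is active."""
        ratio = pinch_ratio(landmarks)
        if ratio is None:
            return self.grabbing  # skip unusable frames, keep current state

        # Exponential moving average damps frame-to-frame jitter.
        if self.smoothed is None:
            self.smoothed = ratio
        else:
            self.smoothed = self.alpha * ratio + (1 - self.alpha) * self.smoothed

        # Two thresholds (hysteresis) so the grab does not flicker
        # when the ratio hovers near a single cutoff.
        if self.grabbing:
            if self.smoothed > self.release_above:
                self.grabbing = False
        elif self.smoothed < self.grab_below:
            self.grabbing = True
        return self.grabbing
```

In a live loop, each frame's landmarks from the hand tracker would be converted to `(x, y)` tuples and passed to `update`. The gap between the two thresholds trades responsiveness for stability: a wider gap means fewer accidental drops but a slower release.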
MLH tracks: Gemini
Intended track: AR/VR, CV, Creative
