Inspiration
Everyone has experienced the "Moving Day Nightmare": standing in a new, empty bedroom with a heavy dresser, only to realize it blocks the heater or prevents the door from opening. Moving is often ranked as one of life’s most stressful events, largely due to spatial anxiety: the fear that your physical life won't fit into your new chapter.
We wanted to build a tool that replaces the tape measure and "gut feelings" with Spatial Intelligence. With the release of Gemini 3, we finally have a model capable of understanding three-dimensional relationships from an ordinary video. Vesta G3 was born from a simple question: What if your home could move itself digitally before you ever pack a box?
What it does
Vesta G3 is an autonomous spatial transition agent. The user simply uploads two videos: one of their current furnished home and one of the empty new space.
- The Inventory Agent identifies and measures every piece of furniture with zero manual input.
- The Architect Agent maps the new room, identifying "invisible" constraints like power outlets, windows, and door swings.
- The Placement Agent then "moves" the inventory into the new space, generating optimized 3D layouts and a "Fit-Check" Report that warns you if a sofa is 2 inches too long or if a bed blocks a vent.
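The Fit-Check idea above can be illustrated with a minimal TypeScript sketch. The types, field names, and the `fitCheck` function are hypothetical and are not taken from the Vesta G3 codebase; the sketch only compares an item's estimated footprint against the free space along one wall, in inches.

```typescript
// Hypothetical types for illustration only; not the project's real schema.
interface FurnitureItem {
  name: string;
  widthIn: number; // estimated width in inches
  depthIn: number; // estimated depth in inches
}

interface WallSlot {
  label: string;
  freeWidthIn: number; // unobstructed wall span in inches
  maxDepthIn: number; // depth available before blocking a door swing, vent, etc.
}

interface FitResult {
  fits: boolean;
  message: string;
}

// Compare an item's footprint against a wall slot and report any overage.
function fitCheck(item: FurnitureItem, slot: WallSlot): FitResult {
  const widthOver = item.widthIn - slot.freeWidthIn;
  const depthOver = item.depthIn - slot.maxDepthIn;
  if (widthOver > 0) {
    return {
      fits: false,
      message: `${item.name} is ${widthOver} in too long for ${slot.label}`,
    };
  }
  if (depthOver > 0) {
    return {
      fits: false,
      message: `${item.name} is ${depthOver} in too deep for ${slot.label}`,
    };
  }
  return { fits: true, message: `${item.name} fits ${slot.label}` };
}
```

A real report would also need the reasoning behind each placement; this sketch covers only the dimensional check.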
How we built it
- Prototyping: We used Google AI Studio to refine the spatial reasoning prompts, utilizing Gemini 3’s Deep Think mode to ensure accuracy in dimension estimation.
- Orchestration: The entire application was built in Antigravity. This allowed us to manage a three-agent architecture (Inventory, Architect, and Placement) where each agent passes stateful data to the next via Gemini 3’s 1M+ token context window.
- Visuals: We integrated Veo 3.1 via the Gemini API to generate high-fidelity video previews of the furnished new home, giving users a "walking tour" of their future space.
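As a rough sketch of the three-agent hand-off described above, the following TypeScript chains agents so that each one receives the accumulated state from the previous one. Everything here is an assumption for illustration: the `PipelineState` shape, the `Agent` type, and the stub agents are hypothetical, and the stubs return fixed data where the real agents would call a model.

```typescript
// Hypothetical data shapes; the real project's schema is not documented here.
interface Inventory { items: { name: string; widthIn: number }[] }
interface RoomMap { walls: { label: string; freeWidthIn: number }[] }
interface Layout {
  placements: { item: string; wall: string }[];
  warnings: string[];
}

// State accumulated as it passes from agent to agent.
interface PipelineState {
  inventory?: Inventory;
  room?: RoomMap;
  layout?: Layout;
}

type Agent = (state: PipelineState) => Promise<PipelineState>;

// Run agents in order, merging each agent's output into the shared state.
async function runPipeline(
  agents: Agent[],
  initial: PipelineState = {},
): Promise<PipelineState> {
  let state = initial;
  for (const agent of agents) {
    state = { ...state, ...(await agent(state)) };
  }
  return state;
}

// Stub agents with fixed outputs, standing in for model calls.
const inventoryAgent: Agent = async () => ({
  inventory: { items: [{ name: "sofa", widthIn: 86 }] },
});

const architectAgent: Agent = async () => ({
  room: { walls: [{ label: "north", freeWidthIn: 84 }] },
});

const placementAgent: Agent = async (state) => {
  const wall = state.room?.walls[0];
  const placements: { item: string; wall: string }[] = [];
  const warnings: string[] = [];
  for (const item of state.inventory?.items ?? []) {
    if (wall && item.widthIn <= wall.freeWidthIn) {
      placements.push({ item: item.name, wall: wall.label });
    } else {
      warnings.push(`${item.name} does not fit`);
    }
  }
  return { layout: { placements, warnings } };
};
```

Calling `runPipeline([inventoryAgent, architectAgent, placementAgent])` resolves to a state object holding the inventory, the room map, and the final layout.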
Challenges we ran into
The biggest hurdle was Spatial Scaling. Estimating the exact size of a table from a shaky smartphone video without a physical reference (like a coin or ruler) is incredibly difficult. We solved this by instructing the Inventory Agent to use "environmental benchmarks": objects with fairly standard dimensions, such as the height of electrical outlets and door frames, which provide a real-world scale for estimating the dimensions of nearby furniture.
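The arithmetic behind "environmental benchmarks" can be sketched as follows. This is a simplified illustration, not the Inventory Agent's actual method (which is prompt-driven): it assumes the reference object and the furniture sit at roughly the same distance from the camera, and the `Reference` type, the function names, and the 80-inch door height are example assumptions.

```typescript
// A reference object with a known real-world size and its measured size in the frame.
interface Reference {
  realIn: number; // known real-world height in inches (e.g. a door frame)
  pixels: number; // measured height in the video frame, in pixels
}

// Average the inches-per-pixel ratio over all available references.
function inchesPerPixel(refs: Reference[]): number {
  if (refs.length === 0) {
    throw new Error("need at least one reference object");
  }
  const ratios = refs.map((r) => r.realIn / r.pixels);
  return ratios.reduce((sum, r) => sum + r, 0) / ratios.length;
}

// Convert a pixel measurement of a nearby object to an estimate in inches.
function estimateInches(pixels: number, refs: Reference[]): number {
  return pixels * inchesPerPixel(refs);
}

// Example: a door frame assumed to be 80 in tall spans 400 px in the frame.
const doorFrame: Reference = { realIn: 80, pixels: 400 };
const tableWidthIn = estimateInches(150, [doorFrame]); // about 30 in
```

Averaging over several references reduces the effect of any single mismeasured benchmark, but it does not correct for perspective or depth differences.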
Another challenge was Object Permanence. If a user panned away from a chair and then back to it, we had to ensure the model didn't count it as two different chairs. We overcame this by utilizing Gemini 3’s Thought Signatures to maintain a persistent "World Model" throughout the video processing.
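For comparison, the double-counting problem can also be shown with a plain post-processing heuristic. The sketch below is not how Vesta G3 handles it (the project relies on the model's own state, as described above); it is a hypothetical fallback that treats two detections as the same object when they share a label and their estimated room positions are close. The `Detection` type, the coordinate convention, and the 12-inch threshold are all assumptions.

```typescript
// Hypothetical detection record: a label plus an estimated floor position in inches.
interface Detection {
  label: string;
  x: number;
  y: number;
}

// Keep the first detection of each object; drop later ones with the same
// label that fall within maxDistIn inches of an already-kept detection.
function dedupe(detections: Detection[], maxDistIn = 12): Detection[] {
  const unique: Detection[] = [];
  for (const d of detections) {
    const alreadySeen = unique.some(
      (u) =>
        u.label === d.label &&
        Math.hypot(u.x - d.x, u.y - d.y) <= maxDistIn,
    );
    if (!alreadySeen) {
      unique.push(d);
    }
  }
  return unique;
}
```

A heuristic like this breaks down when two identical chairs sit side by side, which is one reason to keep the tracking state inside the model instead.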
Accomplishments that we're proud of
We are incredibly proud of the "Fit-Check" Reasoning Trace. It doesn't just say "The bed fits"; it explains why it should go on the North wall to avoid morning window glare and ensure access to the primary power outlet. Seeing an AI provide ergonomic advice that actually makes a room feel more "breathable" felt like a true breakthrough in embodied AI.
What we learned
This project taught us that Multimodal AI has moved beyond simple labeling. In 2026, Gemini 3 isn't just seeing pixels; it’s understanding volume, clearance, and human utility. We also learned that Agentic Workflows are far superior to traditional coding for spatial tasks—letting the agents "debate" the best layout produced much more creative results than a hard-coded algorithm ever could.
What's next for Vesta G3
- Marketplace Integration: If an item won't fit, Vesta G3 will automatically draft a listing for Facebook Marketplace or suggest a replacement from IKEA that does fit.
- AR Live-View: Integrating with 2026-era AR glasses so users can "see" their old furniture in the empty room in real time.
- B2B Expansion: Partnering with moving companies to provide automated, high-accuracy quotes based on the visual inventory.
Built With
- gemini3
- genai
- html5
- json
- react19
- tailwindcss
- typescript
- veo3.1
