1. Gemini 3 Family Integration: From Pixels to Purchases
ViShop is powered by a high-performance orchestration of the Gemini 3 family. At its heart lies "Agentic Vision" powered by Gemini 3 Flash. This dedicated vision engine allows the application to capture the active video scene and instantly detect all shoppable products within the frame. It bridges the gap between temporal video content and static product analysis by correlating timestamps with visual identification.
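As a rough illustration of the timestamp correlation described above, the sketch below indexes per-frame detections by product label so a product can be traced back to the moment it first appeared. It is a minimal sketch under stated assumptions, not ViShop's actual code: the `Detection` shape, the `build_timeline` and `first_appearance` helpers, and the confidence threshold are all hypothetical, and the sample detections are hard-coded where a real vision model's output would go.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One product spotted in one captured frame (hypothetical shape)."""
    timestamp_s: float  # playback position of the captured frame, in seconds
    label: str          # product name returned by the vision model
    confidence: float   # model confidence in [0, 1]


def build_timeline(detections, min_confidence=0.5):
    """Map each product label to the sorted timestamps where it was seen."""
    timeline = defaultdict(list)
    for d in detections:
        # Drop low-confidence detections (threshold is an assumed default).
        if d.confidence >= min_confidence:
            timeline[d.label].append(d.timestamp_s)
    return {label: sorted(times) for label, times in timeline.items()}


def first_appearance(timeline, label):
    """Return the earliest timestamp for a label, or None if never seen."""
    times = timeline.get(label)
    return times[0] if times else None


# Hard-coded sample data standing in for real model output.
detections = [
    Detection(12.0, "leather jacket", 0.91),
    Detection(4.5, "leather jacket", 0.88),
    Detection(4.5, "sunglasses", 0.42),  # below threshold, dropped
    Detection(30.0, "sneakers", 0.77),
]
timeline = build_timeline(detections)
print(first_appearance(timeline, "leather jacket"))
```

Keeping the index keyed by label rather than by frame makes the "when did this product appear?" lookup a single dictionary access, which is the query the agent needs when validating a purchase against the video.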
The brain of the system is the Commerce Expert Agent, powered by Gemini 3 Pro and built on the Google ADK. Using Gemini 3's multimodal reasoning, the agent performs real-time searches, verifies products against the video's context, and acts as a UCP Platform. Users can engage in natural dialogue, asking about a product's features or style, and the agent can pinpoint the exact moment the product appeared in the video to validate the purchase. Gemini 3's native multimodality is what lets ViShop translate a visual spark of inspiration into a controlled Universal Commerce Protocol (UCP) payment flow.
2. Technical Disclaimer: The UCP Vision
ViShop is a technical Proof of Concept designed to demonstrate how Agentic Vision and UCP can revolutionize the retail industry. To showcase the end-to-end "one-click" vision, we have simulated certain UCP roles—specifically the Business (Merchant) and Credential Provider (Wallet) layers—using high-fidelity mock data.
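To make the simulated roles more concrete, here is a minimal sketch of how mock Business (Merchant) and Credential Provider (Wallet) layers could be wired together. Everything in it is an assumption for illustration: the class names, method names, token format, and catalog are hypothetical, and none of it reflects the actual UCP message formats or ViShop's implementation.

```python
import uuid
from dataclasses import dataclass


@dataclass
class CheckoutSession:
    """Hypothetical checkout state shared between the mock roles."""
    session_id: str
    item: str
    total_cents: int
    status: str = "open"


class MockMerchant:
    """Stand-in for the Business (Merchant) role, backed by a fixed catalog."""

    def __init__(self, catalog):
        self.catalog = catalog  # item name -> price in cents (mock data)

    def create_checkout(self, item):
        if item not in self.catalog:
            raise KeyError(f"unknown item: {item}")
        return CheckoutSession(str(uuid.uuid4()), item, self.catalog[item])

    def complete(self, session, payment_token):
        # A real merchant would verify the token with a payment provider;
        # the mock only checks the made-up token format and session binding.
        if payment_token != f"mock-token-{session.session_id}":
            raise ValueError("invalid payment token")
        session.status = "completed"
        return session


class MockWallet:
    """Stand-in for the Credential Provider (Wallet) role."""

    def issue_token(self, session, user_approved):
        # The token is only issued after explicit user approval.
        if not user_approved:
            raise PermissionError("user declined the payment")
        return f"mock-token-{session.session_id}"


merchant = MockMerchant({"leather jacket": 12900})
wallet = MockWallet()
session = merchant.create_checkout("leather jacket")
token = wallet.issue_token(session, user_approved=True)
order = merchant.complete(session, token)
print(order.status, order.total_cents)
```

Separating the two roles, even as mocks, keeps the agent's code path the same shape it would have against real counterparts: it asks one party for a checkout, another for a payment credential, and never handles either party's internals.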
In a production environment, this architecture would leverage official UCP Discovery mechanisms (Profiles and Agent Cards) to dynamically find and negotiate with real-world merchants. By combining Gemini 3's long-context multimodality with the standardized primitives of the Universal Commerce Protocol, ViShop eliminates the friction of traditional e-commerce, transforming an "I want this" thought into a completed transaction in seconds.
3. Inspiration: The Next-Gen Google Lens
Our inspiration was simple: What if Google Lens existed inside every pixel of your YouTube experience? We wanted to evolve the passive act of watching a video into an active, frictionless storefront. ViShop isn't just a tool; it's a demonstration of how Gemini 3 creates a future where the distance between digital inspiration and physical ownership is collapsed into a single, intelligent interaction.
Built With
- a2a
- a2ui
- adk
- docker
- fastapi
- gemini3
- plasmo
- react
- typescript
- ucp