Inspiration
I realized that e-commerce is broken because it relies entirely on text.
- If a pipe bursts under your sink, you don't know the name of the specific wrench you need.
- If you see a cool pair of glasses, you don't know if they suit your face shape.
- If you want a discount, you can't negotiate with a static "Add to Cart" button.
I wanted to build an Autonomous Shopkeeper: an AI that doesn't just "search" but actually sees your problem, reasons about the solution, and even negotiates the price with you like a human.
What it does
ShopLens AI is a multimodal reasoning agent that transforms a standard online store into an intelligent consultancy. It features 5 distinct "Agent Modes":
- The Mechanic (Repair Mode): Users upload a video of a broken object (e.g., a leaking pipe). The AI diagnoses the mechanical failure and maps the solution to specific tools in our inventory.
- The Stylist (Fashion Mode): Analyses the user's face shape and skin tone from a selfie to recommend products that actually look good (e.g., "Round frames for your Square face").
- The Gift Scout: Users upload a screenshot of a friend's Instagram grid. The AI infers their hobbies and "vibe" to suggest the perfect gift.
- The Designer (Decor Mode): Reads room aesthetics (minimalist, industrial) to suggest matching furniture.
- HaggleAI: A real-time negotiation engine. Users can debate the price with the AI, which has a hidden "floor price" and gets "annoyed" at lowball offers.
How I built it
I built this as a Dual-Layer Architecture:
- The "Brain" (New Code): I built a completely new Node.js/Express AI layer specifically for the Gemini 3 Hackathon. This handles the complex System Prompting, Multimodal Vision processing, and the "Personality State" for the negotiation logic.
- The "Body" (Base Infrastructure): I connected this agent to an existing MERN stack e-commerce boilerplate (MongoDB/React) to simulate a real-world inventory. The AI Agent "pilots" this database, searching and filtering products via JSON function calling.
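The function-calling step between the "Brain" and the "Body" could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual code: the tool name `search_products`, its parameters, and the product fields (`category`, `price`, `tags`) are all hypothetical, and the exact declaration format depends on the SDK in use. The idea is that the model returns a JSON function call and the backend converts its arguments into a MongoDB filter.

```javascript
// Hypothetical tool declaration handed to the model (all names are assumptions).
const searchProductsTool = {
  name: "search_products",
  description: "Search the store inventory for matching products.",
  parameters: {
    type: "object",
    properties: {
      category: { type: "string" },
      maxPrice: { type: "number" },
      tags: { type: "array", items: { type: "string" } },
    },
    required: ["category"],
  },
};

// Convert the arguments of a model-issued function call into a MongoDB filter.
// Only known fields are copied, so the model cannot inject arbitrary operators.
function buildProductFilter(args) {
  if (typeof args.category !== "string" || args.category.trim() === "") {
    throw new Error("category is required");
  }
  const filter = { category: args.category.trim().toLowerCase() };
  if (typeof args.maxPrice === "number" && Number.isFinite(args.maxPrice)) {
    filter.price = { $lte: args.maxPrice };
  }
  if (Array.isArray(args.tags) && args.tags.length > 0) {
    filter.tags = { $in: args.tags.filter((t) => typeof t === "string") };
  }
  return filter;
}

// Example: the Stylist decides that a square face suits round frames.
const call = {
  name: "search_products",
  args: { category: "Glasses", maxPrice: 150, tags: ["round"] },
};
const filter = buildProductFilter(call.args);
// `filter` can then be passed to a MongoDB collection's find(filter).
```

Copying only known fields keeps the model in the role of choosing search criteria while the backend stays in control of the query shape.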
Gemini Implementation: I used the Gemini 3 Flash model for its fast Multimodal Vision capabilities (analyzing user photos). The negotiation logic also relies on the reasoning of Gemini 3 Flash, which lets the AI maintain complex "deal states" and remember price floors during a haggle session.
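One way to picture a "deal state" is as a small object that is updated on every user offer and serialized into the prompt each turn. The sketch below is hypothetical: the field names, the three-strike annoyance limit, and the rule that offers under 50% of the list price count as lowballs are illustrative assumptions, not values from the project.

```javascript
// Hypothetical deal state for one haggle session (all field names are assumptions).
function createDealState(listPrice, floorPrice) {
  return { listPrice, floorPrice, lastOffer: null, annoyance: 0, rounds: 0, closed: false };
}

// Fold one user offer into the state.
// Returns a new state object (the input is not mutated) plus a decision label.
function applyOffer(state, offer) {
  if (state.closed) return { state, decision: "closed" };

  const next = { ...state, lastOffer: offer, rounds: state.rounds + 1 };

  // Simplification: any offer at or above the hidden floor is accepted.
  if (offer >= state.floorPrice) {
    next.closed = true;
    return { state: next, decision: "accept" };
  }

  // Assumed lowball rule: below half the list price raises annoyance.
  if (offer < state.listPrice * 0.5) {
    next.annoyance = Math.min(3, state.annoyance + 1);
    if (next.annoyance >= 3) {
      next.closed = true; // the shopkeeper ends the negotiation
      return { state: next, decision: "walk_away" };
    }
    return { state: next, decision: "reject_annoyed" };
  }

  // Below the floor but not insulting: keep negotiating.
  return { state: next, decision: "counter" };
}
```

Keeping the state outside the model means the price floor and the annoyance level survive across turns even if the conversation history is truncated.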
Challenges I ran into
The "Preview" Instability:
I originally designed the system for the gemini-3-flash-preview model. However, during testing, I hit frequent 503 Overload Errors.
- The Fix: I engineered a "Model Toggle" in the backend. The live demo gracefully falls back to the stable gemini-2.5-flash model when the Gemini 3 API is unresponsive, ensuring the judges always see a working product.
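A fallback of this kind can be sketched in a few lines. The two model IDs come from the write-up; everything else is an assumption. In particular, `callModel` is a hypothetical injected function standing in for the real SDK call, and it is assumed to throw an error carrying a numeric `status` field on HTTP failures.

```javascript
// Model IDs are taken from the write-up; the rest is a hypothetical sketch.
const PRIMARY_MODEL = "gemini-3-flash-preview";
const FALLBACK_MODEL = "gemini-2.5-flash";

// callModel(modelId, prompt) is an injected async function (an assumption)
// that resolves to the model's text or throws an error with a `status` field.
async function generateWithFallback(callModel, prompt) {
  try {
    const text = await callModel(PRIMARY_MODEL, prompt);
    return { model: PRIMARY_MODEL, text };
  } catch (err) {
    // Fall back only when the primary model is overloaded or unavailable.
    // Other errors (bad request, auth) are rethrown so they are not hidden.
    if (err && err.status === 503) {
      const text = await callModel(FALLBACK_MODEL, prompt);
      return { model: FALLBACK_MODEL, text };
    }
    throw err;
  }
}
```

Returning the model name alongside the text makes it easy to log which model actually served each request during the demo.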
Hallucination in Negotiation: Early versions of HaggleAI would sometimes sell a $500 item for $1 just because the user was polite.
- The Fix: I implemented "System Instruction Guardrails" that treat the floor_price as an immutable law, forcing the AI to use "Emotional Rejection" logic instead of caving in.
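As a rough illustration, a guardrail like this could combine a system instruction with a check in code. The instruction wording, the JSON reply shape, and the `enforceFloor` helper below are hypothetical. The server-side check in particular is an extra safety net suggested here, not something the write-up describes.

```javascript
// Build the system instruction for a haggle session (wording is a hypothetical example).
function buildHaggleInstruction(product) {
  return [
    `You are a shopkeeper selling "${product.name}" listed at $${product.listPrice}.`,
    `Your floor_price is $${product.floorPrice}. Never reveal it and never agree to a price below it.`,
    "If an offer is below the floor_price, refuse in character and show irritation at lowball offers.",
    'Reply as JSON: {"message": string, "agreedPrice": number | null}.',
  ].join("\n");
}

// Second line of defense (an assumption, not from the write-up):
// validate the model's reply in code before any discount is created.
// Returns the price to honor, or null if no valid deal was reached.
function enforceFloor(reply, product) {
  const price = reply.agreedPrice;
  if (typeof price !== "number" || !Number.isFinite(price)) return null;
  if (price < product.floorPrice) return null; // the model caved: discard the deal
  return Math.min(price, product.listPrice); // never charge above the list price
}

// Example: a polite user talks the model down to $1 on a $500 item.
const jacket = { name: "Leather Jacket", listPrice: 500, floorPrice: 350 };
const honored = enforceFloor({ message: "Deal!", agreedPrice: 1 }, jacket);
// honored is null, so no $1 sale is created.
```

With a check like this, a prompt that slips past the system instruction still cannot produce a sale below the floor.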
Accomplishments that I'm proud of
- I successfully made the AI "See" abstract concepts (like "Vibe" or "Face Shape") and translate them into concrete Database Queries.
- HaggleAI feels incredibly human; it's genuinely fun to try to outsmart the bot for a discount.
What's next for ShopLens AI
- AR Integration: Letting you "try on" the glasses immediately after the AI suggests them.
- Voice Mode: Negotiating the price out loud instead of typing.
Built With
- express.js
- gemini3-api
- google-gemini
- javascript
- mongodb
- node.js
- react
- vision-api
