🚀 Inspiration

Retail shelf auditing and cataloging is a repetitive, manual task that wastes hours of human effort in large-scale inventory systems. We were inspired by the idea of automating this process with computer vision and LLMs, turning shelf images into structured product data in seconds. With this, we aim to empower small and large retailers with AI-driven smart cataloging, boosting efficiency, discoverability, and SEO performance.


🛠 What it does

Our web app allows users to:

  • Upload a photo of a retail shelf
  • Automatically segment and identify products using YOLOv11-Nano
  • Send cropped product images to the Gemini API to generate names, descriptions, and SEO-optimized categories
  • Display and export product data (CSV) through a visually stunning UI
  • Maintain per-user history of detections via Supabase (authentication + storage)

All of this happens in under 8 seconds, processing 40+ items with high accuracy.
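
The CSV export step can be sketched with Python's standard library alone. This is a minimal illustration, not the project's actual exporter: the column names in `FIELDS` and the function name `products_to_csv` are assumptions, since the real export schema is not shown here.

```python
import csv
import io

# Hypothetical column names; the project's real export schema is not shown.
FIELDS = ["name", "description", "category"]


def products_to_csv(products):
    """Serialize a list of product dicts to a CSV string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for product in products:
        # Keep only the known columns; missing keys become empty cells.
        writer.writerow({field: product.get(field, "") for field in FIELDS})
    return buffer.getvalue()
```

Values that contain commas are quoted automatically by the `csv` module, so free-text descriptions from the LLM do not break the column layout.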


🏗 How we built it

  • Frontend: Built with React, TypeScript, and Vite; styled with TailwindCSS, with Framer Motion for animations
  • Backend: Flask API running on Render, integrated with YOLOv11-Nano for real-time product detection
  • LLM Integration: Used the Gemini API to describe and classify products
  • Auth & Storage: Supabase for authentication and persistent user-specific data
  • UX Focus: Custom cursor, scroll and hover animations, light/dark themes, and smooth transitions throughout the interface
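
Between detection and the LLM call, each detected product has to be cut out of the shelf image. The sketch below shows one way to turn detector output into safe crop regions. It does not call the detector itself: the `Detection` tuple, the confidence threshold, and the padding value are all illustrative assumptions rather than the project's real settings.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    # Hypothetical container; a real detector returns its own result type.
    x1: float
    y1: float
    x2: float
    y2: float
    confidence: float


def crop_regions(
    detections: List[Detection],
    image_width: int,
    image_height: int,
    min_confidence: float = 0.25,  # assumed threshold
    pad: int = 4,                  # assumed padding in pixels
) -> List[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) boxes clamped to the image bounds."""
    regions = []
    for det in detections:
        if det.confidence < min_confidence:
            continue  # drop low-confidence boxes before paying for an LLM call
        left = max(0, int(det.x1) - pad)
        top = max(0, int(det.y1) - pad)
        right = min(image_width, int(det.x2) + pad)
        bottom = min(image_height, int(det.y2) + pad)
        if right > left and bottom > top:
            regions.append((left, top, right, bottom))
    return regions
```

The returned tuples use the same (left, upper, right, lower) order that Pillow's `Image.crop` expects, so each region could be passed to it directly.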

🧱 Challenges we ran into

  • Achieving fast and accurate object detection at scale: YOLOv5 was too slow, so we migrated to YOLOv11-Nano
  • Handling multiple API integrations (CV + LLM) and ensuring consistency in data
  • Managing CORS issues and file uploads in a seamless way across platforms
  • Designing a modern UX that's both beautiful and functional under time pressure
  • Parsing and structuring LLM responses into clean, SEO-optimized metadata
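
The last challenge, turning free-form LLM replies into clean records, can be illustrated as follows. This is a sketch under stated assumptions: it presumes the prompt asks the model for a JSON object with `name`, `description`, and `category` keys, which is a hypothetical schema, and it handles the common case where the reply is wrapped in a Markdown code fence.

```python
import json
import re

REQUIRED_KEYS = ("name", "description", "category")  # assumed schema

# Build the fence marker programmatically to keep this example readable.
FENCE = "`" * 3
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_product_response(text):
    """Extract a product dict from an LLM reply, or return None if unusable."""
    match = FENCED_BLOCK.search(text)
    payload = match.group(1) if match else text
    try:
        data = json.loads(payload.strip())
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    cleaned = {}
    for key in REQUIRED_KEYS:
        value = data.get(key)
        if not isinstance(value, str) or not value.strip():
            return None  # reject records with missing or empty fields
        cleaned[key] = " ".join(value.split())  # collapse stray whitespace
    return cleaned
```

Returning `None` instead of raising lets the caller decide whether to retry the request or skip the item, which keeps one malformed reply from failing a whole shelf.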

🏆 Accomplishments that we're proud of

  • Built a full-stack CV + LLM-powered system in a short time
  • Achieved fast end-to-end processing (under 8 seconds for 40+ products)
  • Created a delightful user experience with scroll-based animations and dynamic feedback
  • Implemented secure login, history tracking, and export capabilities for user data
  • Designed a system that can be directly useful for ecommerce platforms or retail digitization

📚 What we learned

  • How to integrate and fine-tune CV + LLM pipelines for real-world applications
  • Techniques for optimizing inference speed (model selection, preprocessing)
  • The power of prompt engineering when describing products visually
  • Best practices for managing async API flows and state in React
  • How to turn AI output into business-ready, structured, and SEO-aligned content
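
One generic way to keep latency down when a shelf yields dozens of crops is to issue the per-product description requests concurrently. The write-up does not say whether the project does this, so the sketch below is an assumption, and `describe` stands in for whatever function wraps the LLM call.

```python
from concurrent.futures import ThreadPoolExecutor


def describe_all(crops, describe, max_workers=8):
    """Call describe(crop) for every crop concurrently, preserving input order.

    `describe` is a hypothetical callable wrapping the LLM request; a failed
    call yields None for that crop instead of aborting the whole batch.
    """

    def safe_describe(crop):
        try:
            return describe(crop)
        except Exception:
            return None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(safe_describe, crops))
```

Threads suit this workload because each call spends most of its time waiting on the network. The `max_workers` value is arbitrary here and would need tuning against the API's rate limits.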

🔮 What's next for Ecommerce Shelf Automation with CV

  • SEO features: Generate keyword-optimized metadata and titles
  • Platform Integration: One-click publish to Amazon, Shopify, etc.
  • Video Compatibility: Analyze videos to detect items frame-by-frame
  • Multilingual Descriptions: Expand product reach globally
  • Batch Uploads & Analytics: For enterprise-level retail insights
