Inspiration
Retail shelf auditing and cataloging are repetitive, manual tasks that waste hours of human effort in large-scale inventory systems. We were inspired by the idea of automating this process with computer vision and LLMs, turning shelf images into structured product data in seconds. With this, we aim to empower small and large retailers with AI-driven smart cataloging that boosts efficiency, discoverability, and SEO performance.
What it does
Our web app allows users to:
- Upload a photo of a retail shelf
- Automatically segment and identify products using YOLOv11-Nano
- Send cropped product images to Gemini API to generate names, descriptions, and SEO-optimized categories
- Display and export product data (CSV) through a visually stunning UI
- Maintain per-user history of detections via Supabase (authentication + storage)
All of this happens in under 8 seconds, processing 40+ items with high accuracy.
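The CSV export step can be illustrated with a minimal sketch. It assumes each product record is a dict with `name`, `description`, and `category` keys; those column names and the `products_to_csv` helper are hypothetical, since the project's actual export schema is not documented here.

```python
import csv
import io

# Hypothetical column names; the project's real export schema is not documented.
FIELDS = ["name", "description", "category"]


def products_to_csv(products):
    """Serialize a list of product dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for product in products:
        # Keep only the known columns; missing keys become empty cells.
        writer.writerow({field: product.get(field, "") for field in FIELDS})
    return buffer.getvalue()
```

The `csv` module quotes fields that contain commas or quotes, so free-text descriptions generated by an LLM stay in a single column when the file is opened in a spreadsheet.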
How we built it
- Frontend: Built with React, TypeScript, Vite, styled with TailwindCSS and Framer Motion for animations
- Backend: Flask API running on Render, integrated with YOLOv11-Nano for real-time product detection
- LLM Integration: Used Gemini API to describe and classify products
- Auth & Storage: Supabase for authentication and persistent user-specific data
- UX Focus: Custom cursor, scroll and hover animations, light/dark themes, and smooth transitions throughout the interface
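Between detection and the LLM call, each detected product has to be cropped out of the shelf image. The following is a rough sketch of that hand-off, not the project's actual code: the `Detection` type, the `crop_boxes` helper, the confidence threshold, and the padding value are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected product: corner coordinates in pixels plus a score."""
    x1: float
    y1: float
    x2: float
    y2: float
    confidence: float


def crop_boxes(detections, image_width, image_height, min_confidence=0.25, padding=4):
    """Return integer (left, top, right, bottom) crop boxes clamped to the image."""
    boxes = []
    for det in detections:
        # Skip weak detections so they never reach the (slower) LLM step.
        if det.confidence < min_confidence:
            continue
        # Pad slightly so labels at the box edge are not cut off, then clamp.
        left = max(0, int(det.x1) - padding)
        top = max(0, int(det.y1) - padding)
        right = min(image_width, int(det.x2) + padding)
        bottom = min(image_height, int(det.y2) + padding)
        if right > left and bottom > top:
            boxes.append((left, top, right, bottom))
    return boxes
```

Each tuple uses the `(left, upper, right, lower)` order that Pillow's `Image.crop` accepts, so the boxes could be passed to it directly if the backend loads images with Pillow.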
Challenges we ran into
- Achieving fast and accurate object detection at scale: YOLOv5 was too slow, so we migrated to YOLOv11-Nano for speed
- Handling multiple API integrations (CV + LLM) and ensuring consistency in data
- Managing CORS issues and file uploads in a seamless way across platforms
- Designing a modern UX that's both beautiful and functional under time pressure
- Parsing and structuring LLM responses into clean, SEO-optimized metadata
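The last challenge, turning free-form LLM text into clean records, can be sketched as follows. This assumes the model was prompted to answer with a JSON object containing `name`, `description`, and `category`; that schema and the `parse_product_response` helper are hypothetical and are not taken from the project.

```python
import json
import re

REQUIRED_KEYS = ("name", "description", "category")  # hypothetical schema
FENCE = "`" * 3  # Markdown code-fence marker that models often wrap JSON in


def parse_product_response(text):
    """Extract a product dict from raw LLM text, or return None if unusable."""
    # Prefer the contents of a fenced block when the model produced one.
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    candidate = fenced.group(1) if fenced else text
    # Take the outermost {...} span to skip any surrounding chatter.
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(candidate[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    cleaned = {}
    for key in REQUIRED_KEYS:
        value = data.get(key)
        # Reject records with missing, empty, or non-string fields.
        if not isinstance(value, str) or not value.strip():
            return None
        cleaned[key] = " ".join(value.split())  # collapse stray whitespace
    return cleaned
```

Returning `None` for anything malformed lets the caller decide whether to retry the request or skip that product, rather than letting a partial record reach the CSV export.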
Accomplishments that we're proud of
- Built a full-stack CV + LLM-powered system in a short time
- Achieved fast end-to-end processing (under 8 seconds for 40+ products)
- Created a delightful user experience with scroll-based animations and dynamic feedback
- Implemented secure login, history tracking, and export capabilities for user data
- Designed a system that can be directly useful for ecommerce platforms or retail digitization
What we learned
- How to integrate and fine-tune CV + LLM pipelines for real-world applications
- Techniques for optimizing inference speed (model selection, preprocessing)
- The power of prompt engineering when describing products visually
- Best practices for managing async API flows and state in React
- How to turn AI output into business-ready, structured, and SEO-aligned content
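On the speed point, one common way to keep total latency low when dozens of crops each need an LLM call is to issue those calls concurrently. The sketch below shows that general technique only; whether the project does this is not stated, and `describe_all` and its `describe` callback are hypothetical names standing in for whatever function wraps the Gemini request.

```python
from concurrent.futures import ThreadPoolExecutor


def describe_all(crops, describe, max_workers=8):
    """Call describe(crop) for every crop concurrently, preserving input order."""

    def safe_describe(crop):
        # One failed request should not discard the results for the other crops.
        try:
            return describe(crop)
        except Exception:
            return None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the inputs.
        return list(pool.map(safe_describe, crops))
```

Threads suit this case because the work is network-bound: each worker spends most of its time waiting on the remote API. The `max_workers` value would need tuning against the provider's rate limits.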
What's next for Ecommerce Shelf Automation with CV
- SEO features: Generate keyword-optimized metadata and titles
- Platform Integration: One-click publish to Amazon, Shopify, etc.
- Video Compatibility: Analyze videos to detect items frame-by-frame
- Multilingual Descriptions: Expand product reach globally
- Batch Uploads & Analytics: For enterprise-level retail insights
Built With
- flask
- javascript
- opencv
- python
- react
- ultralytics
- yolo