Superstore Vision
AI-Powered Retail Shelf Intelligence with Amazon Nova
Inspiration
Retail stores always face a major challenge in keeping shelves fully stocked and properly organized. Even large retailers still rely heavily on manual shelf audits, where employees walk through store aisles checking for empty slots, misplaced products, or incorrect shelf layouts.
These audits are often time-consuming, inconsistent, and may fail to detect issues early enough.
While exploring how multimodal AI can interpret real-world visual environments, we realized that a simple shelf photo contains a large amount of operational information. If AI could analyze shelf images and automatically detect stock gaps or planogram violations, it could significantly improve retail efficiency.
This insight led to Superstore Vision — a system that transforms ordinary shelf images into actionable retail intelligence using Amazon Nova.
What It Does
Superstore Vision is an AI-powered system that analyzes shelf images captured by store employees and automatically detects:
- Empty shelf slots
- Low-stock products
- Planogram compliance issues
- Restocking requirements
The system then generates clear operational instructions such as:
Restock 6 units of Tesco Lemonade in Aisle 2A.
This turns a simple shelf image into a real-time operational decision.
How We Built It
The system uses a modular architecture combining computer vision, multimodal AI, and decision logic.
Core Components
1. Shelf Image Ingestion
A store employee captures a shelf image and uploads it through a simple interface.
2. Multimodal Analysis with Amazon Nova
The uploaded image is sent to Amazon Nova multimodal models via Amazon Bedrock.
Nova analyzes the image and identifies:
- Products present on the shelf
- Empty shelf slots
- Shelf positions
This converts visual information into structured data.
3. Planogram Verification
The detected shelf state is compared with a predefined planogram stored as structured JSON data.
A planogram represents the expected product arrangement and quantity per shelf.
Example structure:
{
"aisle": "2A",
"product": "Tesco Lemonade",
"expected_units": 10,
"current_units": 4
}
4. Inventory Intelligence
The system checks a lightweight inventory database to determine whether backstock is available for the missing product.
5. Decision Engine
A decision layer determines the correct action based on analysis results:
- Restock from backstock
- Flag a shelf issue
- Identify potential shrinkage
Example logic:
if current_units < expected_units:
action = "Restock from backstock"
6. Task Generation
The system generates clear operational instructions for store employees.
Example output:
Restock 6 units of Tesco Lemonade in Aisle 2A
Technology Stack
AI
- Amazon Nova Lite (Multimodal)
- Amazon Bedrock
Backend
- Python
- FastAPI
Processing
- Pillow
- OpenCV
Data
- SQLite Database
- JSON-based planogram
Infrastructure
- AWS
What We Learned
Building Superstore Vision taught us several important lessons about multimodal AI and real-world system design.
First, we learned how multimodal models like Amazon Nova can convert visual inputs into structured operational insights. This capability opens many possibilities beyond traditional text-based AI systems.
Second, we learned how to connect computer vision models with business logic. Detecting objects in an image is only the first step — the real value comes from translating that information into meaningful operational actions.
Finally, we learned how to design a modular AI architecture that integrates vision analysis, inventory data, and decision logic into a single workflow.
Challenges We Faced
One of the biggest challenges was designing a system that could reliably interpret shelf images and convert results into structured data suitable for operational decisions.
Another challenge was building the decision layer that connects AI analysis with retail workflows. Detecting a missing product is useful, but the system must also determine whether the item should be restocked or flagged for further investigation.
We also focused on creating a system architecture that could realistically scale in real retail environments without requiring expensive hardware such as dedicated shelf cameras.
Impact
Retail stores lose billions of dollars annually due to out-of-stock products and inefficient shelf monitoring.
Superstore Vision demonstrates how multimodal AI can:
- Reduce manual shelf audits
- Improve shelf availability
- Detect stock issues faster
Unlike traditional monitoring systems that rely on specialized cameras or sensors, this solution works using simple shelf images captured by employees.
By turning shelf photos into actionable insights, Superstore Vision enables retailers to respond faster to inventory problems and improve operational efficiency.
Future Improvements
Future versions of Superstore Vision could include:
- 📱 Real-time mobile apps for store employees
- 🎙 Voice instructions using Amazon Nova Sonic
- 🤖 Automated UI workflows with Nova Act
- 🔗 Integration with live retail inventory systems
- 🏬 Real-time shelf monitoring across multiple stores
Conclusion
Superstore Vision demonstrates how multimodal AI can bridge the gap between physical retail environments and intelligent automation.
By combining shelf image analysis, planogram verification, and inventory intelligence, the system transforms simple photos into operational decisions that improve retail efficiency.
Log in or sign up for Devpost to join the conversation.