Superstore Vision

AI-Powered Retail Shelf Intelligence with Amazon Nova

Inspiration

Retail stores always face a major challenge in keeping shelves fully stocked and properly organized. Even large retailers still rely heavily on manual shelf audits, where employees walk through store aisles checking for empty slots, misplaced products, or incorrect shelf layouts.

These audits are often time-consuming, inconsistent, and may fail to detect issues early enough.

While exploring how multimodal AI can interpret real-world visual environments, we realized that a simple shelf photo contains a large amount of operational information. If AI could analyze shelf images and automatically detect stock gaps or planogram violations, it could significantly improve retail efficiency.

This insight led to Superstore Vision — a system that transforms ordinary shelf images into actionable retail intelligence using Amazon Nova.

What It Does

Superstore Vision is an AI-powered system that analyzes shelf images captured by store employees and automatically detects:

Empty shelf slots
Low-stock products
Planogram compliance issues
Restocking requirements

The system then generates clear operational instructions such as:

Restock 6 units of Tesco Lemonade in Aisle 2A.

This turns a simple shelf image into a real-time operational decision.

How We Built It

The system uses a modular architecture combining computer vision, multimodal AI, and decision logic.

Core Components

1. Shelf Image Ingestion

A store employee captures a shelf image and uploads it through a simple interface.

2. Multimodal Analysis with Amazon Nova

The uploaded image is sent to Amazon Nova multimodal models via Amazon Bedrock.

Nova analyzes the image and identifies:

Products present on the shelf
Empty shelf slots
Shelf positions

This converts visual information into structured data.

3. Planogram Verification

The detected shelf state is compared with a predefined planogram stored as structured JSON data.

A planogram represents the expected product arrangement and quantity per shelf.

Example structure:

{
  "aisle": "2A",
  "product": "Tesco Lemonade",
  "expected_units": 10,
  "current_units": 4
}

4. Inventory Intelligence

The system checks a lightweight inventory database to determine whether backstock is available for the missing product.

5. Decision Engine

A decision layer determines the correct action based on analysis results:

Restock from backstock
Flag a shelf issue
Identify potential shrinkage

Example logic:

if current_units < expected_units:
    action = "Restock from backstock"

6. Task Generation

The system generates clear operational instructions for store employees.

Example output:

Restock 6 units of Tesco Lemonade in Aisle 2A

Technology Stack

AI

Amazon Nova Lite (Multimodal)
Amazon Bedrock

Backend

Python
FastAPI

Processing

Pillow
OpenCV

Data

SQLite Database
JSON-based planogram

Infrastructure

What We Learned

Building Superstore Vision taught us several important lessons about multimodal AI and real-world system design.

First, we learned how multimodal models like Amazon Nova can convert visual inputs into structured operational insights. This capability opens many possibilities beyond traditional text-based AI systems.

Second, we learned how to connect computer vision models with business logic. Detecting objects in an image is only the first step — the real value comes from translating that information into meaningful operational actions.

Finally, we learned how to design a modular AI architecture that integrates vision analysis, inventory data, and decision logic into a single workflow.

Challenges We Faced

One of the biggest challenges was designing a system that could reliably interpret shelf images and convert results into structured data suitable for operational decisions.

Another challenge was building the decision layer that connects AI analysis with retail workflows. Detecting a missing product is useful, but the system must also determine whether the item should be restocked or flagged for further investigation.

We also focused on creating a system architecture that could realistically scale in real retail environments without requiring expensive hardware such as dedicated shelf cameras.

Impact

Retail stores lose billions of dollars annually due to out-of-stock products and inefficient shelf monitoring.

Superstore Vision demonstrates how multimodal AI can:

Reduce manual shelf audits
Improve shelf availability
Detect stock issues faster

Unlike traditional monitoring systems that rely on specialized cameras or sensors, this solution works using simple shelf images captured by employees.

By turning shelf photos into actionable insights, Superstore Vision enables retailers to respond faster to inventory problems and improve operational efficiency.

Future Improvements

Future versions of Superstore Vision could include:

📱 Real-time mobile apps for store employees
🎙 Voice instructions using Amazon Nova Sonic
🤖 Automated UI workflows with Nova Act
🔗 Integration with live retail inventory systems
🏬 Real-time shelf monitoring across multiple stores

Conclusion

Superstore Vision demonstrates how multimodal AI can bridge the gap between physical retail environments and intelligent automation.

By combining shelf image analysis, planogram verification, and inventory intelligence, the system transforms simple photos into operational decisions that improve retail efficiency.

Built With

Updates

Athul Jayakumar started this project — Mar 16, 2026 06:17 PM EDT

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.