💡 Inspiration
In the retail world, data is abundant, but action is scarce. Store managers have dashboards showing how many people are waiting, but they rarely get immediate answers on why or what to do next.
I wanted to move beyond passive analytics. Inspired by the Gemini 3 Action Era, I set out to build an Autonomous Retail Agent—one that doesn't just display numbers but understands context, sees the floor through vision, and acts like an experienced manager.
🤖 What it does
IntelliQueue AI is a hybrid reasoning engine that optimizes retail operations in real-time. It acts as a "Marathon Agent," continuously monitoring and self-correcting.
- Predictive Analytics: Calculates wait times based on crowd density and staffing.
- Visual Intelligence: Uses Gemini 1.5 Flash (Vision) to analyze CCTV snapshots, detecting crowd mood and verifying data accuracy.
- Strategic Reasoning: Instead of just outputting "20 mins wait," it diagnoses the root cause (e.g., "Staff Shortage" or "Rainy Weather") and generates an immediate action plan.
- Self-Correction Loop: The agent accepts feedback. If a prediction is wrong, it autonomously recalibrates its internal logic for the next hour.
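The self-correction loop can be sketched as a calibration multiplier the agent nudges whenever feedback says a prediction was off. This is a minimal illustration with invented names and an invented update rule, not the production IntelliQueue logic:

```python
# Minimal sketch of the "Marathon Agent" self-correction loop
# (illustrative names and update rule, not the real app code).

class SelfCorrectingAgent:
    def __init__(self, learning_rate: float = 0.2):
        self.calibration = 1.0          # multiplier applied to raw predictions
        self.learning_rate = learning_rate

    def predict(self, base_wait_minutes: float) -> float:
        """Apply the current calibration to the base ML prediction."""
        return base_wait_minutes * self.calibration

    def feedback(self, predicted: float, actual: float) -> None:
        """Recalibrate toward the observed wait time ('You were wrong')."""
        error_ratio = actual / predicted
        # Move the multiplier a fraction of the way toward the ideal ratio.
        self.calibration += self.learning_rate * (
            error_ratio * self.calibration - self.calibration
        )

agent = SelfCorrectingAgent()
first = agent.predict(20.0)                    # baseline: 20 min predicted
agent.feedback(predicted=first, actual=30.0)   # store reports it was really 30
second = agent.predict(20.0)                   # next prediction drifts upward
```

After one piece of feedback the same input yields a higher estimate (22 minutes instead of 20), which is the "recalibrates for the next hour" behavior in miniature.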
⚙️ How I built it
I architected a Hybrid Neuro-Symbolic System that merges Classical Machine Learning with Large Language Models.
- The "Calculator" (Deterministic ML): I used Scikit-Learn to build a Random Forest Regressor. This model handles the numerical heavy lifting, predicting base wait times from historical data patterns: $$Wait_{base} = f(Staff, Crowd, Hour, Day)$$
- The "Brain" (Probabilistic AI): I integrated Google Gemini 1.5 Flash via the API. The system feeds the ML prediction, the environmental context, and the CCTV image into Gemini. The LLM then acts as the "Reasoning Layer," applying context multipliers and generating strategic advice.
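The "Calculator" layer might look like the following sketch: a Random Forest Regressor trained on (staff, crowd, hour, day) tuples. The synthetic dataset and the toy wait-time formula below stand in for the real historical data:

```python
# Illustrative "Calculator": Random Forest predicting base wait times.
# The training data here is synthetic, standing in for historical patterns.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
n = 500
staff = rng.integers(1, 8, n)     # cashiers on duty
crowd = rng.integers(0, 60, n)    # people in the store
hour = rng.integers(8, 22, n)     # hour of day
day = rng.integers(0, 7, n)       # day of week (0 = Monday)
# Toy ground truth: waits grow with crowd and shrink with staff.
wait = crowd / (staff * 1.5) + rng.normal(0, 1, n)

X = np.column_stack([staff, crowd, hour, day])
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, wait)

# Base wait for 2 staff, 40 shoppers, 2 PM on a Saturday.
base_wait = model.predict([[2, 40, 14, 5]])[0]
```

This number is what then gets handed to Gemini as the grounded starting point for its contextual reasoning.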
- The Interface: Built entirely on Streamlit, featuring a custom "Time Simulation Slider" that lets judges test the agent's response to different times of day (e.g., simulating a 2 PM rush hour while testing at midnight).
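The time-simulation idea can be reduced to a pure helper that a Streamlit slider (e.g., `st.slider("Simulated hour", 0, 23)`) would feed. The function name and rush-hour thresholds here are illustrative, not the exact app logic:

```python
# Sketch of the time-simulation helper the slider would drive.
# Thresholds and names are illustrative assumptions.
def simulated_context(simulated_hour: int) -> dict:
    """Build the context the agent reasons over, treating the slider
    value -- not the server clock -- as the ground-truth time."""
    is_rush = 12 <= simulated_hour <= 14 or 17 <= simulated_hour <= 19
    return {
        "hour": simulated_hour,
        "period": "rush" if is_rush else "off-peak",
    }

ctx = simulated_context(14)   # simulate a 2 PM rush while testing at midnight
```

Keeping this pure (no clock reads) is what makes the "2 PM at midnight" demo deterministic.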
🧠 Challenges I ran into
- Temporal Dissonance: Since I was building this during off-hours, testing "Peak Time" logic was hard. I had to engineer a prompt system that forced Gemini to treat the simulated time (from the slider) as the ground truth, rather than the server's actual timestamp.
- Multimodal hallucinations: Sometimes the AI would "see" things in the CCTV image that contradicted the data. I solved this by implementing a strict prompt hierarchy where numerical data acts as the primary constraint for the vision model.
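Both fixes come down to prompt construction. A hedged sketch of such a prompt hierarchy (the wording is illustrative, not the exact prompt shipped in the app): the simulated time is declared non-negotiable, and the numbers are stated before the image so vision output is constrained by them:

```python
# Illustrative prompt hierarchy: numbers first as hard constraints,
# simulated time overriding any real-world timestamp.
def build_agent_prompt(base_wait: float, staff: int, crowd: int,
                       simulated_hour: int) -> str:
    return (
        f"GROUND TRUTH (non-negotiable):\n"
        f"- Current time is {simulated_hour}:00. Ignore any other timestamp.\n"
        f"- Sensors report {crowd} customers and {staff} staff.\n"
        f"- The ML model predicts a base wait of {base_wait:.0f} minutes.\n\n"
        "TASK: Inspect the attached CCTV frame. If the image appears to "
        "contradict the numbers above, trust the numbers. Diagnose the "
        "likely root cause and propose one immediate action."
    )

prompt = build_agent_prompt(base_wait=20.0, staff=2, crowd=35,
                            simulated_hour=14)
```

This string (plus the image) would be passed to the Gemini API call; the strict ordering is what keeps the vision model from overriding the sensor data.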
🏅 Accomplishments that I'm proud of
- The Feedback Loop: Building the "Marathon Agent" feature where the user can tell the AI "You were wrong," and the AI autonomously runs a self-correction protocol to adjust its parameters.
- Solo Development: As an aspiring Data Scientist, I successfully handled the full stack—from the ML backend to the Streamlit frontend and the GenAI integration—entirely on my own.
🧪 What I learned
This project taught me that Hybrid AI is the future. LLMs are powerful, but they can be inconsistent with math. By grounding Gemini with a Random Forest model, I achieved the best of both worlds: Mathematical Accuracy + Human-like Reasoning.
🚀 What's next for IntelliQueue
- Integration with live CCTV RTSP feeds (instead of static uploads).
- Fine-tuning a smaller Gemini model specifically for retail logistics to reduce latency.
- Expanding the "Self-Correction" database to permanently store learned patterns.
Built With
- computer-vision
- google-gemini
- machine-learning
- pandas
- python
- scikit-learn
- streamlit