Inspiration
Modern logistics increasingly depends on efficient, autonomous warehouse systems. This project grew out of a desire to bridge the gap between high-level, human-like reasoning and low-level, deterministic robotic control. We wanted to explore how a Large Language Model, specifically Google's Gemini, could act as the "brain" of an Automated Storage and Retrieval System (ASRS), translating abstract mission directives into precise, actionable robot trajectories within a dynamic environment.
What it does
This project is a full-stack web simulation of an ASRS warehouse. It features an autonomous robot that navigates a grid-based environment to perform tasks such as charging, inventory checks, and transporting goods. The system uses the Gemini API to interpret high-level human instructions (e.g., "Transport goods to the packing station") and converts them into specific mission objectives. The robot then utilizes a custom pathfinding engine to navigate the warehouse floor, avoiding obstacles and managing its battery life, all while providing real-time visualization of its status and simulated LIDAR-based obstacle detection.
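To make the simulated LIDAR concrete, here is a minimal sketch of grid raycasting. It is not the project's actual code: the names `castRay` and `lidarSweep`, the grid encoding (1 = obstacle, 0 = free), the convention that cell (i, j) is centered on integer coordinates, and the fixed-step ray marching are all assumptions made for illustration.

```typescript
type Vec = { x: number; y: number };

// Marches along one ray in fixed steps and returns the distance at which it
// first enters an obstacle cell or leaves the grid, or maxRange if neither happens.
// Assumed encoding: grid[row][col] === 1 is an obstacle, 0 is free floor,
// and each cell is centered on its integer (col, row) coordinate.
function castRay(
  grid: number[][],
  origin: Vec,
  angle: number,
  maxRange: number,
  step: number = 0.05,
): number {
  const dx = Math.cos(angle);
  const dy = Math.sin(angle);
  for (let i = 0; i * step <= maxRange; i++) {
    const d = i * step;
    const col = Math.round(origin.x + dx * d);
    const row = Math.round(origin.y + dy * d);
    const cell = grid[row]?.[col];
    // Anything outside the grid is treated as a wall.
    if (cell === undefined || cell === 1) return d;
  }
  return maxRange;
}

// Casts rayCount evenly spaced rays over a full circle around the robot.
function lidarSweep(
  grid: number[][],
  origin: Vec,
  rayCount: number,
  maxRange: number,
): number[] {
  const distances: number[] = [];
  for (let i = 0; i < rayCount; i++) {
    const angle = (2 * Math.PI * i) / rayCount;
    distances.push(castRay(grid, origin, angle, maxRange));
  }
  return distances;
}
```

The returned distances can be drawn as line segments from the robot on the canvas. Fixed-step marching is simple but can skip across thin corners; a grid traversal such as DDA would be the more exact alternative.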
How we built it
The project was built using a robust, modular architecture:
- Frontend & Visualization: A React and TypeScript-based application provides the user interface. We used the HTML5 Canvas API for high-performance rendering of the warehouse grid, the robot's movement, and its LIDAR raycasting.
- Mission Planning (The Brain): The Gemini API acts as the intelligent layer. It parses natural language commands and spatial data to determine the appropriate mission type and target destination (e.g., mapping "Charging Dock" to specific coordinates).
- Pathfinding Engine (The Cerebellum): We implemented the A* search algorithm to calculate the most efficient path for the robot. The algorithm uses the Manhattan distance heuristic.
- Simulation Engine: A custom physics and state engine manages the robot's kinematics, battery consumption, and smooth interpolation between grid cells.
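One way to keep the handoff from the mission planner to the deterministic layers safe is to validate the model's reply before acting on it. The sketch below assumes the model is asked to answer with a small JSON object; the JSON shape, the mission type names, the `STATIONS` table and its coordinates, and the `parseMission` function are all hypothetical and are not taken from the project. No Gemini API call is shown.

```typescript
type Coord = { x: number; y: number };
type MissionType = "CHARGE" | "INVENTORY_CHECK" | "TRANSPORT";
type Mission = { type: MissionType; target: Coord };

// Hypothetical lookup table from station names to grid coordinates.
const STATIONS = new Map<string, Coord>([
  ["charging dock", { x: 0, y: 0 }],
  ["packing station", { x: 9, y: 4 }],
]);

const MISSION_TYPES: readonly string[] = ["CHARGE", "INVENTORY_CHECK", "TRANSPORT"];

// Parses a raw model reply such as '{"type":"CHARGE","target":"Charging Dock"}'.
// Returns null for anything that is not valid JSON, not a known mission type,
// or not a known station, so the robot never acts on an unchecked reply.
function parseMission(raw: string): Mission | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const { type, target } = data as { type?: unknown; target?: unknown };
  if (typeof type !== "string" || !MISSION_TYPES.includes(type)) return null;
  if (typeof target !== "string") return null;
  const coords = STATIONS.get(target.trim().toLowerCase());
  if (coords === undefined) return null;
  return { type: type as MissionType, target: { x: coords.x, y: coords.y } };
}
```

With this split, the language model only chooses among known missions and stations, while the coordinates handed to the pathfinder always come from the fixed table.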
Challenges we ran into
The most significant technical hurdle was bridging the gap between our discrete, grid-based pathfinding algorithm and the continuous coordinate system required for smooth, real-world robot movement. Initially, our A* implementation was strictly integer-based. This caused immediate failures when the robot needed to return to its "home" position, which was located at the center of a walkway at non-integer (continuous) coordinates. Because the algorithm evaluated only discrete grid nodes, no node could ever exactly match a non-integer goal, so the open list was exhausted without a path being found. To resolve this, we had to decouple the heuristic target from the physical target: we modified the pathfinding logic to calculate the heuristic using a rounded integer goal (goalInt).
This allowed the A* algorithm to successfully navigate to the nearest valid integer grid cell. Once the path was generated, we appended the final, precise continuous coordinate to the trajectory, allowing the robot to smoothly glide to its exact resting spot.
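The approach described above can be sketched as follows. This is an illustrative reconstruction, not the project's original code: only the name goalInt comes from the write-up, while `findPath`, `manhattan`, the grid encoding (0 = free, 1 = obstacle), 4-connected movement with unit step cost, and the use of goalInt for the arrival test as well as the heuristic are assumptions.

```typescript
type Point = { x: number; y: number };

// Manhattan distance between two grid cells.
function manhattan(a: Point, b: Point): number {
  return Math.abs(a.x - b.x) + Math.abs(a.y - b.y);
}

// A* on a 4-connected grid. Assumed encoding: grid[y][x] === 0 is free floor.
// The search runs toward the rounded goal (goalInt); the exact continuous goal
// is appended afterwards so the robot can glide to its precise resting spot.
function findPath(grid: number[][], start: Point, goal: Point): Point[] | null {
  const startInt: Point = { x: Math.round(start.x), y: Math.round(start.y) };
  const goalInt: Point = { x: Math.round(goal.x), y: Math.round(goal.y) };
  const key = (p: Point): string => `${p.x},${p.y}`;
  const isFree = (p: Point): boolean => grid[p.y]?.[p.x] === 0;
  if (!isFree(startInt) || !isFree(goalInt)) return null;

  const open: Point[] = [startInt];
  const gScore = new Map<string, number>([[key(startInt), 0]]);
  const cameFrom = new Map<string, Point>();
  const closed = new Set<string>();
  const steps = [
    { dx: 1, dy: 0 }, { dx: -1, dy: 0 }, { dx: 0, dy: 1 }, { dx: 0, dy: -1 },
  ];

  while (open.length > 0) {
    // Pick the open node with the lowest f = g + h (linear scan for brevity).
    let bestIdx = 0;
    let bestF = Infinity;
    open.forEach((p, i) => {
      const f = (gScore.get(key(p)) ?? Infinity) + manhattan(p, goalInt);
      if (f < bestF) { bestF = f; bestIdx = i; }
    });
    const current = open.splice(bestIdx, 1)[0]!;

    if (current.x === goalInt.x && current.y === goalInt.y) {
      // Walk the parent links back to the start to rebuild the path.
      const path: Point[] = [current];
      let prev = cameFrom.get(key(current));
      while (prev !== undefined) {
        path.unshift(prev);
        prev = cameFrom.get(key(prev));
      }
      // Append the exact goal only when it differs from the grid cell.
      if (goal.x !== goalInt.x || goal.y !== goalInt.y) {
        path.push({ x: goal.x, y: goal.y });
      }
      return path;
    }

    closed.add(key(current));
    const g = gScore.get(key(current)) ?? Infinity;
    for (const s of steps) {
      const next: Point = { x: current.x + s.dx, y: current.y + s.dy };
      if (!isFree(next) || closed.has(key(next))) continue;
      if (g + 1 < (gScore.get(key(next)) ?? Infinity)) {
        gScore.set(key(next), g + 1);
        cameFrom.set(key(next), current);
        if (!open.some((p) => key(p) === key(next))) open.push(next);
      }
    }
  }
  return null; // Open list exhausted: no route to goalInt.
}
```

The linear scan over the open list keeps the sketch short; a binary heap would be the usual choice for larger grids. Because every step costs 1 and movement is 4-connected, the Manhattan heuristic never overestimates, so the path to goalInt is a shortest one.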
Accomplishments that we're proud of
- Seamless AI-Robotics Integration: We successfully created a pipeline where natural language input directly influences deterministic robotic behavior without sacrificing safety or precision.
- Robust Pathfinding: Our modified A* algorithm effectively handles both standard grid-based targets and non-integer coordinates, ensuring the robot can always return to its home base or navigate to precise sub-grid locations.
- Interactive Simulation: We built a fully functional, responsive simulation environment that provides real-time feedback on the robot's position, battery status, and sensor data (including a visual LIDAR sweep).
What we learned
- Hybrid System Design: We gained deep insights into the complexities of integrating high-level AI reasoning with low-level deterministic algorithms. LLMs are fantastic for orchestration, but they must be paired with strict, math-based engines for physical execution.
- Coordinate System Alignment: We learned that standard algorithms often require custom adaptations to handle the specific constraints of a real-world simulation, such as mapping continuous space onto a discrete grid.
- Simulation Fidelity: We realized that even in a 2D model, details like pathfinding precision, raycasted sensor simulation, and battery management are crucial for creating a realistic and useful digital twin.
What's next for Proposal of ASRS Simulation Guided by Gemini Robotics
The next steps for this project include:
- Multi-Agent Coordination: Expanding the system to support multiple robots simultaneously, which will require complex traffic management, deadlock resolution, and collision avoidance.
- Dynamic Obstacle Avoidance: Implementing real-time path replanning (such as D* Lite) when the robot's LIDAR encounters unexpected obstacles (like a dropped box or a human worker) in the warehouse.
- Vision-Language-Action (VLA) Expansion: Feeding the simulated LIDAR and spatial grid directly back into Gemini as an image or matrix, allowing the LLM to make spatial reasoning decisions on the fly rather than just initial mission planning.
Built With
- googleaistudio