🚛 WorldMovers Logistics Nexus
An Interactive, Multimodal Mission Control for Modern Logistics, Powered by Google's Agent Development Kit (ADK)
Inspiration
In today's global supply chain, logistics data is fragmented and siloed. A logistics coordinator might check one system for inventory levels, another for fleet vehicle status, and a third-party site for weather conditions. Getting a simple, holistic answer is a manual and time-consuming process.
We were inspired by the paradigm of Google's Agent Development Kit (ADK): what if, instead of logging into multiple systems, you could just ask a single, intelligent agent? We wanted to build a centralized "mission control" for logistics, powered by a smart agent that could not only answer questions but also see the world, use tools to fetch live data, and act on its own initiative.
🤖 What it does
WorldMovers Logistics Nexus is a cloud-native, AI-powered dashboard that gives logistics coordinators a unified view of their entire operation. At its core is an advanced multimodal agent, built on the principles of Google's ADK, that acts as a single point of contact for all data inquiries.
Key Features:
Advanced Multimodal AI Assistant: This is far more than a chatbot. Users can interact with the agent through a variety of inputs:
- Text & Image Understanding: Ask complex questions about an uploaded image. For example, upload a photo of a damaged package and ask, "Assess the damage to this box. Is it still safe for international shipping?"
- Live Camera Input: Use the device's camera to capture and send images directly to the AI for instant analysis, perfect for on-the-floor operations.
- Voice Feedback (TTS): The AI's responses are converted into high-quality audio and played automatically, providing an accessible and engaging user experience.
- Downloadable Chat Logs: For compliance and record-keeping, entire conversations with the AI can be downloaded as a professional PDF or DOCX file.
Live Tool Usage (ADK in Action): The agent is not just a chatbot. It uses ADK-inspired tools to fetch real-time data, such as `get_weather` and `get_current_time`, providing users with up-to-the-minute information that would otherwise require leaving the platform.
Unified Operational Dashboard: It provides a single pane of glass to visualize key logistics data, including an interactive inventory management table and a live map of the automated fleet of trucks and drones.
Enterprise-Grade Analytics & Monitoring: Every interaction with the AI agent (each query, response, tool call, error, and latency measurement) is securely logged in real time to a structured table in Google BigQuery, enabling powerful analytics and performance monitoring.
Scalable by Design: The entire application is containerized with Docker and built to be deployed on Google Cloud Run, ensuring it can scale effortlessly from one user to thousands.
⚙️ How we built it
We built the WorldMovers Logistics Nexus using a modern, cloud-native stack, with Google Cloud at its heart.
Core AI "Agent Engine": We used the powerful multimodal capabilities of Google's Gemini models, structured using the principles of the Google Agent Development Kit (ADK). Its native function-calling was the key to implementing our tool-using agent, and its ability to process images and text together powers our entire multimodal chat experience.
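To make the tool-using pattern concrete, here is a minimal sketch of how a function call requested by the model could be routed to Python code. It is not our production code: the `TOOLS` registry, the `run_tool_call` helper, the `utc_offset_hours` parameter, and the stubbed weather result are all assumptions for illustration. In the real app, the tool name and arguments would come from the model's function-calling response.

```python
from datetime import datetime, timedelta, timezone


def get_current_time(utc_offset_hours: float = 0.0) -> dict:
    """Return the current time at a fixed UTC offset (hypothetical signature)."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    now = datetime.now(tz)
    return {
        "utc_offset_hours": utc_offset_hours,
        "iso_time": now.isoformat(timespec="seconds"),
    }


def get_weather(city: str) -> dict:
    """Placeholder: a real tool would query a weather API here."""
    return {"city": city, "condition": "unknown", "source": "stub"}


# Registry mapping the tool names the model may request to Python callables.
TOOLS = {"get_weather": get_weather, "get_current_time": get_current_time}


def run_tool_call(name: str, args: dict) -> dict:
    """Execute one requested tool call and return a JSON-friendly result."""
    tool = TOOLS.get(name)
    if tool is None:
        return {"error": f"unknown tool: {name}"}
    try:
        return tool(**args)
    except Exception as exc:
        # Report the failure back to the model instead of crashing the app.
        return {"error": f"{type(exc).__name__}: {exc}"}
```

Returning an error dictionary rather than raising lets the agent explain the failure to the user in natural language, and gives the logging layer something structured to record.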
Web Frontend: The interactive dashboard was rapidly built using Streamlit. We utilized its rich component library, including `st.camera_input` and `st.download_button`. The user interface was further enhanced with:
- gTTS (Google Text-to-Speech) to create and play audio responses.
- FPDF2 and python-docx to dynamically generate downloadable PDF and Word documents from chat histories.
Cloud Platform & Deployment: The entire solution is built on the Google Cloud Platform.
- Containerization: We used Docker to package our Streamlit application and all its dependencies into a portable container.
- CI/CD & Builds: Google Cloud Build automatically takes our code, builds the Docker image, and pushes it to the Artifact Registry.
- Serverless Compute: We use Google Cloud Run as our Agent Engine. It provides a fully managed, auto-scaling, and secure environment to run our container.
Data, Analytics & Logging: Google BigQuery serves as our analytics backbone. We designed a specific schema and a logging function in our Python code to send structured data about every agent interaction directly to a BigQuery table.
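The logging function described above might look roughly like the following sketch. The column names, the `build_log_row` and `log_interaction` helpers, and the table ID format are assumptions for illustration, not our actual schema. The `client` argument is expected to be a `google.cloud.bigquery.Client`, whose `insert_rows_json` method returns a list of per-row errors that is empty on success.

```python
import uuid
from datetime import datetime, timezone


def build_log_row(query, response, tool_calls, latency_ms, error=None):
    """Assemble one structured row describing a single agent interaction."""
    return {
        "interaction_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "query": query,
        "response": response,
        "tool_calls": list(tool_calls),  # assumed to be a REPEATED STRING column
        "latency_ms": float(latency_ms),
        "error": error,
    }


def log_interaction(client, table_id, row):
    """Stream one row to BigQuery; return True on success, False otherwise."""
    try:
        errors = client.insert_rows_json(table_id, [row])
    except Exception:
        # A logging failure should never take down the chat UI.
        return False
    return not errors
```

Keeping row construction separate from the network call makes the schema easy to unit test without touching BigQuery.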
Security: All sensitive credentials, like the Gemini API Key and BigQuery service details, are securely stored in Google Secret Manager and injected into the Cloud Run environment at runtime, ensuring they are never hard-coded.
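As a minimal sketch of reading an injected secret at runtime, the helper below fails fast with a clear message when a variable is missing. The `get_required_env` helper, the `MissingSecretError` class, and the `GEMINI_API_KEY` variable name are hypothetical; the actual names in our deployment may differ.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required environment variable is absent or empty."""


def get_required_env(name: str) -> str:
    """Read a secret injected into the container environment."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise MissingSecretError(f"environment variable {name} is not set")
    return value


# Example usage (variable name is an assumption):
# api_key = get_required_env("GEMINI_API_KEY")
```

Failing at startup with the variable's name in the message makes a misconfigured deployment much easier to diagnose from the Cloud Run logs than a later authentication error would be.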
🚧 Challenges we ran into
Building a cloud-native application from scratch came with real-world challenges that we were proud to overcome:
Dependency and Version Conflicts: We initially faced an `ImportError: cannot import name 'Part'`. This taught us the critical importance of precise dependency management. We solved it by requiring `google-generativeai>=0.5.2` in our `requirements.txt`, a minimum-version constraint that ensures the modern function-calling features are available.
Environment Mismatch (Local vs. Cloud): Our app worked locally using `st.secrets` but crashed on Cloud Run. This was a crucial learning moment about the difference between local development and a production environment. We refactored our code to use `os.environ.get()`, the cloud-standard way of handling secrets.
Cloud Permissions (IAM): When we first deployed, our app couldn't write to BigQuery. This forced us to dive into GCP's Identity and Access Management (IAM). We solved it by granting the `BigQuery Data Editor` role to the service account our Cloud Run instance was using.
Complex UI State Management: Implementing the multimodal chat with TTS and downloadable reports led to subtle bugs like `AttributeError` and `RuntimeError` due to Streamlit's execution model. We solved these by carefully managing session state and ensuring correct data types (`bytes` vs. `bytearray`) were used for all widgets.
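The `bytes` vs. `bytearray` fix can be sketched as a small normalization step applied before handing data to a download or audio widget. The `as_download_bytes` helper is hypothetical, and the sketch assumes the document generator hands back a mutable buffer such as a `bytearray`.

```python
def as_download_bytes(data) -> bytes:
    """Normalize generated file content to immutable bytes for UI widgets."""
    if isinstance(data, bytes):
        return data
    if isinstance(data, (bytearray, memoryview)):
        # Copy the mutable buffer into an immutable bytes object.
        return bytes(data)
    if isinstance(data, str):
        return data.encode("utf-8")
    raise TypeError(f"unsupported download payload type: {type(data).__name__}")
```

Doing the conversion in one place means every widget receives the same type, regardless of which library produced the file.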
🏆 Accomplishments that we're proud of
- A Rich, Interactive Multimodal Experience: We moved far beyond simple text chat. By integrating camera input, voice responses (TTS), and downloadable reports, we created a truly user-centric application that feels modern and highly functional.
- A True Tool-Using Agent: We successfully implemented the core concept of the ADK. Our agent doesn't just generate text; it actively uses tools to retrieve external information, making it a functional and practical assistant.
- Designing for a Multi-Agent Future (ADK Requirement): The next frontier is to grant the agent the ability to perform actions and delegate tasks. We've built the foundation for this by enabling Autonomous Actions (like placing a supply order with user confirmation) and designing the architecture to support a Multi-Agent System, where our main Logistics Agent could delegate tasks to a specialized "Finance Agent" for cost calculations. This is our vision for a true multi-agent implementation.
- End-to-End Cloud Native Application: We didn't just write a script. We built and deployed a complete, scalable, and secure application on Google Cloud, from containerization with Docker and Cloud Build to serverless execution on Cloud Run.
- Built-in Observability: Integrating BigQuery from the start is a major accomplishment. Our application has enterprise-grade logging built-in, allowing us to instantly query and visualize the agent's performance, user engagement, and error rates.
🏫 What we learned
- The Power of Function Calling & Multimodality: We learned firsthand how Gemini's capabilities are the key to transforming an LLM into a true agent that can interact with the world and solve problems using multiple modes of information.
- Infrastructure as a Solved Problem: Using Google Cloud Run allowed us to focus on our application's logic and user experience without worrying about managing servers, scaling, or infrastructure.
- Designing for the Cloud: We learned the importance of writing code that is "cloud-aware," especially regarding configuration and security (using environment variables for secrets instead of local files).
- Data is Everything: By logging to BigQuery, we learned how easy it is to capture high-quality data that can be used to create dashboards, monitor performance, and ultimately improve our agent over time.
🚀 What's next for WorldMovers Logistics Nexus
This is just the beginning. We have a clear vision for making our Agent Engine even more powerful.
- Expand the Toolkit: Integrate with real-world APIs for live shipment tracking (e.g., FedEx, DHL), freight cost calculation, and currency conversion to provide instantly bookable quotes.
- Introduce Proactive Agents: Elevate the agent from being reactive to proactive. It could monitor for events, like a weather delay on a shipping route, and proactively alert the user: "I see your shipment to London may be delayed due to a storm warning. Would you like me to send a notification to the recipient?"
- Live Looker Studio Dashboard: Build a comprehensive monitoring dashboard in Looker Studio directly on top of our BigQuery data to visualize key metrics like average latency, tool usage breakdown, and daily error counts in real-time.
- Persistent Memory & User Profiles: Integrate a vector database (like Pinecone or Cloud SQL with pgvector) to give the agent long-term memory of past conversations and user preferences, allowing for a truly personalized experience.
Built With
- docker
- gemini-2.5
- gemini-api
- google-artifact-registry
- google-bigquery
- google-cloud-build
- google-cloud-platform-(gcp)
- google-cloud-run
- google-generative-ai
- google-secret-manager
- pandas
- python
- sql
- streamlit


