Inspiration

Finding personalized fit and styling advice often requires juggling multiple apps, platforms, and sources. We wanted to create a single, intuitive fashion assistant that lives where users already communicate: on WhatsApp. Our goal was to build a multi-agent system that could understand a user's unique style, curate products, and even let them virtually "try on" clothes, all through a simple chat.

What it does

Mtindo is a personal AI fashion stylist that operates as a conversational assistant on WhatsApp. It helps users with everything from finding outfit inspiration to managing their personal wardrobe. The system is built on a GKE-powered microservice architecture and uses a suite of specialized agents, each responsible for a different task. One agent manages a user's style profile, provides personalized recommendations, and helps them build outfits. Another interacts with an external e-commerce API to find products and fulfill orders. A third acts as a creative visual assistant, using a state-of-the-art Imagen image generation model to create realistic virtual try-on images by combining a product image with a user-uploaded photo.

How we built it

We built Mtindo using the Google Agent Development Kit (ADK), which enabled us to define and orchestrate a team of specialized, modular agents. Each agent runs as a separate microservice and is deployed on Google Kubernetes Engine (GKE).

The agents communicate with each other using our custom Microservice Context Protocol (MCP), a lightweight JSON-RPC protocol. This allowed for seamless, type-safe agent-to-agent communication, ensuring that complex tasks like virtual try-on requests were handled efficiently.
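To make the protocol description concrete, here is a minimal sketch of what a JSON-RPC 2.0 exchange between two agents could look like. It is not our production code: the method name `tryon.generate`, its parameters, and the `fake_tryon` handler are hypothetical placeholders, and only the envelope fields (`jsonrpc`, `method`, `params`, `id`, `result`, `error`) and the "method not found" code come from the JSON-RPC 2.0 specification.

```python
import json
from dataclasses import dataclass, field
from itertools import count
from typing import Any, Callable, Dict

_ids = count(1)  # simple request-id generator for this sketch


@dataclass
class RpcRequest:
    """A JSON-RPC 2.0 request envelope sent from one agent to another."""
    method: str
    params: Dict[str, Any]
    id: int = field(default_factory=lambda: next(_ids))

    def to_json(self) -> str:
        return json.dumps(
            {"jsonrpc": "2.0", "method": self.method,
             "params": self.params, "id": self.id}
        )


def handle(raw: str, handlers: Dict[str, Callable[..., Any]]) -> str:
    """Dispatch a raw request to a registered handler and build the reply."""
    msg = json.loads(raw)
    handler = handlers.get(msg.get("method"))
    if handler is None:
        # -32601 is the "Method not found" code defined by JSON-RPC 2.0.
        reply = {"jsonrpc": "2.0",
                 "error": {"code": -32601, "message": "Method not found"},
                 "id": msg.get("id")}
    else:
        reply = {"jsonrpc": "2.0",
                 "result": handler(**msg.get("params", {})),
                 "id": msg.get("id")}
    return json.dumps(reply)


def fake_tryon(user_id: str, product_id: str) -> Dict[str, str]:
    """Hypothetical handler standing in for the virtual try-on agent."""
    return {"status": "queued", "user_id": user_id, "product_id": product_id}


if __name__ == "__main__":
    request = RpcRequest("tryon.generate", {"user_id": "u1", "product_id": "p9"})
    print(handle(request.to_json(), {"tryon.generate": fake_tryon}))
```

Echoing the request `id` in every reply is what lets a caller match responses to requests when several agent calls are in flight at once.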

For the user-facing interface, we leveraged the WhatsApp Business Platform API to route incoming messages to our WhatsApp Router, which acts as a central hub. Data persistence for user profiles, shopping lists, and try-on history is handled by Google Cloud Firestore. The core of the conversational and creative capabilities relies on Google’s generative models, specifically gemini-2.0-flash-exp for conversational understanding and imagen-3.0-generate-002 for virtual try-on image generation.
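The hub pattern described above can be sketched roughly as follows. This is an illustration only: the agent names, the keyword lists, and the `classify` function are assumptions made for the example. In the real system, intent understanding is done by the Gemini model rather than by keyword matching, and each agent is a separate service rather than a local function.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stand-ins for the specialized agents.
def stylist_agent(text: str) -> str:
    return f"[stylist] {text}"

def commerce_agent(text: str) -> str:
    return f"[commerce] {text}"

def tryon_agent(text: str) -> str:
    return f"[tryon] {text}"

AGENTS: Dict[str, Callable[[str], str]] = {
    "stylist": stylist_agent,
    "commerce": commerce_agent,
    "tryon": tryon_agent,
}

# Keyword lists are placeholders for model-based intent detection.
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "tryon": ("try on", "try-on"),
    "commerce": ("buy", "order", "price"),
}


def classify(text: str) -> str:
    """Pick an agent for a message; fall back to the stylist agent."""
    lowered = text.lower()
    for intent, keywords in KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return intent
    return "stylist"


def route(text: str) -> str:
    """Central hub: classify the incoming message and forward it."""
    return AGENTS[classify(text)](text)


if __name__ == "__main__":
    for message in ("Can I try on this jacket?",
                    "I want to buy these shoes",
                    "What should I wear today?"):
        print(route(message))
```

Keeping the routing decision in one place means a new agent can be added by registering it in the hub, without touching the existing agents.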

Challenges we ran into

The primary challenge was managing state and context across a distributed, multi-agent system. While ADK simplifies agent creation, ensuring that the right information (e.g., a user's style profile or a pending order ID) was available at the right time for the right agent required careful design of the MCP and the Firestore data layer. Another significant challenge was orchestrating the deployment of our microservices on GKE, ensuring they could discover and communicate with each other reliably in a containerized environment. This required a deep understanding of Kubernetes networking concepts.
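As an illustration of the service discovery side, Kubernetes gives every Service a stable in-cluster DNS name of the form `<service>.<namespace>.svc.<cluster-domain>`. The helper below is a small sketch of how an agent could build the address of a peer from that convention. The service name `tryon-agent`, the namespace `mtindo`, and port 8080 are hypothetical, and `cluster.local` is the usual default cluster domain, which a given cluster may override.

```python
def service_url(service: str, namespace: str = "default", port: int = 8080,
                cluster_domain: str = "cluster.local") -> str:
    """Build the in-cluster HTTP address of a Kubernetes Service.

    Relies on the standard Service DNS naming scheme; all argument values
    used in the example below are assumptions for illustration.
    """
    return f"http://{service}.{namespace}.svc.{cluster_domain}:{port}"


if __name__ == "__main__":
    # Hypothetical names; substitute the real Service and namespace.
    print(service_url("tryon-agent", namespace="mtindo"))
```

Addressing peers by Service name rather than by pod IP keeps calls working when pods are rescheduled or scaled.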

Accomplishments that we're proud of

We are most proud of successfully building a cohesive, multi-agent system that demonstrates true specialization. Instead of relying on a single monolithic model, our system delegates each complex task to the agent dedicated to it. This microservice approach on GKE makes the entire system scalable, resilient, and easier to debug. We are also proud of the seamless integration of imagen-3.0-generate-002 to create a magical virtual try-on experience directly within a simple chat interface.

What we learned

We learned firsthand the power of GKE for managing complex application deployments and the importance of a well-defined communication protocol like MCP for inter-service communication. We also gained valuable insights into the practical application of generative AI and what's possible with Gemini and Google Cloud Platform, learning how to chain different models and specialized tools together to create a powerful, end-to-end solution.

What's next for Mtindo

The next steps for Mtindo are to expand its capabilities and make it even more powerful. We plan to:

Integrate with more e-commerce APIs to provide a wider range of products.

Handle exchanges and returns, creating a truly end-to-end shopping assistant.

Explore advanced image generation techniques to create more realistic and higher-resolution virtual try-on images.

Add multi-language support to make Mtindo accessible to a global audience.
