Inspiration

Imagine standing on the third floor of Siam Paragon in Bangkok. You can't read a single sign because the directory is entirely in Thai script, and every storefront looks the same. The salespeople smile politely but speak almost no English. You have no way of knowing whether the silk in front of you is real, whether you are expected to bargain, or whether the stall three steps away is selling polyester at five times the price of the authentic brand one floor up. You leave with nothing or, worse, with a fake.

This experience is common. More than one billion cross-border mall visits happen every year, and many of those tourists face the same obstacles: the wrong language, the wrong floor, the wrong store, and no cultural context. Language barriers, navigation problems, and cultural mismatches cost brick-and-mortar retail an estimated $\$1.1$ trillion annually. We built Google Concourse to solve this problem.

What It Does

Google Concourse is a multilingual, multimodal, multi-agent shopping concierge built for the world's largest malls. A tourist can speak or type in their native language, or point their camera at a sign, a price tag, or a product in a window, and ten AI agents go to work: the Orchestrator, which drives the pipeline, and the nine specialists below.

  • Vision: Reads local script, identifies products, and extracts prices.
  • Intent: Parses specific constraints such as budget, payment method, accessibility needs, and time.
  • Translation: Bridges languages seamlessly across the entire pipeline.
  • Cultural Concierge: Our main differentiator, which uses vector Retrieval-Augmented Generation (RAG) to flag tourist traps, find authentic vendors, and explain local payment norms and customs.
  • Search: Finds matching products across every tenant in both languages simultaneously.
  • Routing: Builds an accessible, step-free walking path through the mall.
  • Calling: Phones the store in the local language to hold the item.
  • Booking: Writes a time-to-live (TTL) indexed hold to MongoDB Atlas.
  • Tenant Copilot: Lights up the merchant's dashboard in real time with a full incoming-customer brief.

As a result, the merchant knows the shopper's language, budget, accessibility needs, and arrival time before the shopper reaches the door, and the shopper finds the item already waiting on the counter.
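
The Booking step's TTL-indexed hold can be pictured as a document that carries its own expiry time. The following is a minimal sketch, not the project's actual schema: the field names, the product and shopper identifiers, and the 15-minute window are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical hold window; the real value is not stated in this write-up.
HOLD_TTL = timedelta(minutes=15)


def build_hold(shopper_id: str, product_id: str, now: datetime) -> dict:
    """Build a hold document that records when it should expire."""
    return {
        "shopper_id": shopper_id,
        "product_id": product_id,
        "created_at": now,
        # A MongoDB TTL index on this field with expireAfterSeconds=0
        # removes the document once this time has passed.
        "expires_at": now + HOLD_TTL,
    }


def is_active(hold: dict, now: datetime) -> bool:
    """Apply the same expiry rule in application code at read time."""
    return now < hold["expires_at"]


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
hold = build_hold("shopper-1", "silk-scarf-42", now)
print(is_active(hold, now + timedelta(minutes=5)))   # True
print(is_active(hold, now + timedelta(minutes=20)))  # False
```

With PyMongo, the matching index would be created with something like `holds.create_index("expires_at", expireAfterSeconds=0)`, where `holds` is a hypothetical collection handle. MongoDB's TTL cleanup runs periodically in the background rather than at the exact expiry instant, which is why a read-time check such as `is_active` is worth keeping.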

How We Built It

We connected two user surfaces, a shopper mobile PWA and a merchant dashboard, through ten ADK agents that communicate over the A2A protocol. MongoDB Atlas serves as the unified data, memory, and cultural-knowledge spine. Every agent is independently deployable. The Orchestrator drives the pipeline and streams live agent-activity events to the shopper's "Thinking" screen as each specialist fires. The two-sided handoff is our headline demo moment: a shopper hold triggers a MongoDB Change Stream event, which updates the merchant dashboard with no polling.
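
The idea of an orchestrator that emits an activity event as each specialist fires can be sketched with a plain Python generator. This does not use the ADK or A2A APIs; the agent functions, event fields, and sample values below are hypothetical stand-ins chosen only to show the shape of the event stream.

```python
from typing import Callable, Iterator

# Each specialist takes the shared context and returns an updated copy.
Agent = Callable[[dict], dict]


def intent_agent(ctx: dict) -> dict:
    # Stand-in for real parsing: attach a fixed budget to the context.
    return {**ctx, "budget": 300}


def search_agent(ctx: dict) -> dict:
    # Stand-in for a real catalogue lookup.
    return {**ctx, "matches": ["silk-scarf-42"]}


def run_pipeline(ctx: dict, agents: list[tuple[str, Agent]]) -> Iterator[dict]:
    """Run agents in order, yielding activity events around each step."""
    for name, agent in agents:
        yield {"agent": name, "status": "started"}
        ctx = agent(ctx)
        yield {"agent": name, "status": "done", "keys": sorted(ctx)}


events = list(run_pipeline(
    {"query": "silk scarf under 300"},
    [("intent", intent_agent), ("search", search_agent)],
))
for event in events:
    print(event)
```

Because the pipeline is a generator, a caller can forward each event to the client as soon as it is produced instead of waiting for the whole run to finish.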

We used five MongoDB Atlas features as load-bearing infrastructure rather than just decoration:

  1. Atlas Vector Search for product matching and cultural RAG.
  2. Atlas Search for fuzzy multilingual keywords.
  3. Time-Series Collections for mall telemetry.
  4. Aggregation Pipelines for multi-signal ranking, computing a score $S$ from relevance, distance, and review rating with weights $w_1$, $w_2$, $w_3$: $$S = w_1 \cdot \text{Relevance} + w_2 \cdot \text{Distance}^{-1} + w_3 \cdot \text{Rating}$$
  5. Change Streams for the real-time two-sided handoff.
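
The ranking score $S$ from item 4 can be worked through in a few lines. This is a sketch under stated assumptions: the weight values, the normalisation of relevance and rating to $[0, 1]$, the use of metres for distance, and the 1 m clamp that keeps the inverse-distance term bounded are all illustrative choices, not values taken from the project.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    relevance: float   # e.g. vector-search similarity, assumed in [0, 1]
    distance_m: float  # walking distance in metres (assumed unit)
    rating: float      # review rating, assumed normalised to [0, 1]


def score(c: Candidate, w1: float = 0.6, w2: float = 0.3, w3: float = 0.1) -> float:
    """S = w1 * Relevance + w2 * Distance^-1 + w3 * Rating."""
    # Clamp distance to at least 1 m so the inverse term stays bounded.
    inv_distance = 1.0 / max(c.distance_m, 1.0)
    return w1 * c.relevance + w2 * inv_distance + w3 * c.rating


candidates = [
    Candidate("far-but-exact", relevance=0.95, distance_m=400.0, rating=0.9),
    Candidate("near-but-vague", relevance=0.40, distance_m=20.0, rating=0.7),
]
ranked = sorted(candidates, key=score, reverse=True)
print([c.name for c in ranked])  # ['far-but-exact', 'near-but-vague']
```

In an aggregation pipeline, the same expression would typically live in an `$addFields` stage followed by a `$sort` on the computed field.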

Best of all, the entire stack runs with zero API keys in mock mode, using deterministic fallbacks for every external dependency so the demo never breaks on stage.
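
One way to get a deterministic fallback for an external dependency is to derive the mock output from a hash of the input. The sketch below does this for an embedding call; the environment variable name, the function names, and the 8-dimensional vector are hypothetical, and the live-provider branch is deliberately left unimplemented.

```python
import hashlib
import os


def mock_embedding(text: str, dim: int = 8) -> list[float]:
    """Deterministic stand-in for an embedding API call (dim <= 32)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Map the first `dim` bytes of the digest to floats in [0, 1].
    return [byte / 255.0 for byte in digest[:dim]]


def embed(text: str) -> list[float]:
    """Use the live provider only when a key is configured."""
    # EMBEDDING_API_KEY is a hypothetical variable name.
    if os.environ.get("EMBEDDING_API_KEY"):
        raise NotImplementedError("live provider call goes here")
    return mock_embedding(text)


# The same input always produces the same vector, so demos are repeatable.
print(mock_embedding("silk") == mock_embedding("silk"))  # True
print(len(mock_embedding("silk")))                       # 8
```

The mock vectors carry no semantic meaning. They only guarantee that the rest of the pipeline sees stable, well-formed input when no key is present.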

Challenges

Our hardest challenge was perfecting the Cultural Concierge Agent. Building a RAG corpus that is genuinely trustworthy, especially when distinguishing an authentic vendor from a tourist trap based on multilingual review signals, required careful curation and a strict rule to never fabricate information. A degraded but honest answer always beats a confident wrong one, especially when someone's $\$300$ gift is on the line.
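
The "never fabricate" rule can be expressed as a guard that refuses to make a claim without sufficiently strong retrieved evidence. This is a minimal sketch of that idea only: the `Evidence` type, the 0.75 threshold, and the verdict labels are assumptions for illustration, not the agent's actual logic.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    text: str
    score: float  # retrieval similarity, assumed in [0, 1]


def answer(evidence: list[Evidence], min_score: float = 0.75) -> dict:
    """Answer only from retrieved evidence; otherwise say so."""
    strong = [e for e in evidence if e.score >= min_score]
    if not strong:
        # Degraded but honest: no claim is made without support.
        return {"verdict": "unknown", "sources": []}
    return {"verdict": "supported", "sources": [e.text for e in strong]}


print(answer([]))
print(answer([Evidence("Vendor listed by the mall as a silk retailer", 0.9),
              Evidence("Unrelated review", 0.2)]))
```

Returning the sources alongside the verdict lets the interface show the shopper what the answer is based on.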

The second challenge was the two-sided real-time handoff. The shopper PWA and merchant dashboard must stay in sync through a hold write, a stock decrement, a Change Stream event, and a WebSocket push. Because the system has to behave identically with or without a live Atlas connection, the in-memory repository must mirror every observable behavior of the Mongo repository.
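
The requirement that an in-memory repository mirror the database-backed one can be sketched with a shared interface. Everything here is hypothetical: the `HoldRepository` protocol, its method names, and the event shape are illustrative, and a listener callback stands in for the Change Stream.

```python
from typing import Callable, Protocol

Listener = Callable[[dict], None]


class HoldRepository(Protocol):
    def place_hold(self, product_id: str, shopper_id: str) -> bool: ...
    def subscribe(self, listener: Listener) -> None: ...


class InMemoryHoldRepository:
    """Mock-mode repository: same observable behaviour, no database."""

    def __init__(self, stock: dict[str, int]) -> None:
        self._stock = dict(stock)
        self._listeners: list[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def place_hold(self, product_id: str, shopper_id: str) -> bool:
        if self._stock.get(product_id, 0) <= 0:
            return False  # nothing to hold, and no event is emitted
        self._stock[product_id] -= 1
        event = {"type": "hold_created", "product_id": product_id,
                 "shopper_id": shopper_id, "stock_left": self._stock[product_id]}
        # Stand-in for the Change Stream: notify subscribers after the write.
        for listener in self._listeners:
            listener(event)
        return True


repo: HoldRepository = InMemoryHoldRepository({"silk-scarf-42": 1})
received: list[dict] = []
repo.subscribe(received.append)
print(repo.place_hold("silk-scarf-42", "shopper-1"))  # True
print(repo.place_hold("silk-scarf-42", "shopper-2"))  # False
print(len(received))                                   # 1
```

A database-backed class would implement the same two methods, so one test suite written against `HoldRepository` could run against both implementations and catch behavioral drift.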

What We Learned

We realized that three technologies matured at about the same time to make this possible: Gemini's multimodal reasoning, Google ADK's multi-agent orchestration, and MongoDB Atlas's unified operational + vector + streaming stack. Two years ago, the absence of any one of them would have turned this into a long-term research project. Together, they let us build a functional platform within a three-week hackathon timeline.

Built With

  • a2a-protocol
  • atlas-search
  • atlas-vector-search
  • elevenlabs
  • gemini-2.5
  • google-adk
  • google-cloud-run
  • google-maps-platform
  • hono
  • mongodb-atlas
  • mongodb-change-streams
  • mongodb-mcp-server
  • next.js
  • node.js
  • pnpm
  • python
  • tailwind-css
  • twilio
  • typescript
  • voyage-ai
  • zustand