Revised Project Report — GEO Agent: Luckin Coffee AI Visibility Auditor

Inspiration

Luckin Coffee, with 21,000+ stores worldwide and rapid international expansion underway, opened its first U.S. stores in Manhattan and is building a presence across New York City. Yet when a New Yorker asks ChatGPT for the "best coffee near Times Square," Luckin barely appears. Traditional SEO doesn't help here: AI models synthesize answers from training data, not search rankings. We built the GEO Agent to measure and fix this invisible problem: your brand's Share of Model in generative AI.

What it does

The GEO Agent answers two questions for brand leadership:

1. "How visible are we in AI search?"

  • Sends 13 high-intent queries across 7 categories ("best coffee near Grand Central", "cheapest latte Manhattan", "Luckin vs Starbucks") to three LLMs simultaneously
  • Queries ChatGPT (GPT-4o), Claude, and Perplexity live — no cached or mock data
  • Computes Share of Model (SoM): the percentage of AI answers that mention the brand (see the scoring sketch after this list)
  • Extracts mention position, competitor share of voice, sentiment, and citation presence
  • Breaks down SoM by LLM provider, revealing which AI platforms favor which brands
  • Provides a model-by-model comparison showing how each AI perceives the brand differently
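
A minimal sketch of the SoM math in TypeScript. The `ScanResult` shape and provider labels below are illustrative assumptions, not the project's actual types; the score itself is just mentions divided by total answers, computed overall and per provider:

```typescript
// Illustrative types and math; the project's real shapes may differ.
interface ScanResult {
  provider: "openai" | "anthropic" | "perplexity";
  query: string;
  mentionedBrands: string[]; // filled in by the brand extraction engine
}

// Share of Model: the percentage of AI answers that mention the brand.
function shareOfModel(results: ScanResult[], brand: string): number {
  if (results.length === 0) return 0;
  const hits = results.filter((r) => r.mentionedBrands.includes(brand)).length;
  return (hits / results.length) * 100;
}

// Per-provider breakdown, revealing which platforms favor which brands.
function shareOfModelByProvider(
  results: ScanResult[],
  brand: string
): Record<string, number> {
  const grouped = new Map<string, ScanResult[]>();
  for (const r of results) {
    grouped.set(r.provider, [...(grouped.get(r.provider) ?? []), r]);
  }
  return Object.fromEntries(
    [...grouped].map(([provider, rs]) => [provider, shareOfModel(rs, brand)])
  );
}
```

With 13 queries × 3 models = 39 answers per audit, 9 brand mentions would give an overall SoM of 9/39 ≈ 23%.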

2. "How do we improve?"

  • Uses Tavily web intelligence to analyze the brand's web footprint vs. competitors
  • Identifies 20 content gap topics where the brand lacks online presence
  • Auto-generates four ready-to-use content pieces via Claude: a Wikipedia-style brand summary, a competitor comparison outline, Schema.org JSON-LD structured data, and a GEO content strategy (an illustrative JSON-LD snippet follows this list)
  • Produces prioritized recommendations (rule-based + LLM-generated) ranked by impact and effort
  • Includes an illustrative revenue impact model showing how SoM improvements translate to customer acquisition
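
For the structured-data piece, the output is standard Schema.org JSON-LD. The snippet below is a hand-written illustration of that shape, not actual generated output; field values are placeholders:

```typescript
// Illustrative Schema.org JSON-LD (values are placeholders, not generated output).
const luckinJsonLd = {
  "@context": "https://schema.org",
  "@type": "CafeOrCoffeeShop",
  name: "Luckin Coffee",
  servesCuisine: "Coffee",
  address: {
    "@type": "PostalAddress",
    addressLocality: "New York",
    addressRegion: "NY",
    addressCountry: "US",
  },
  sameAs: ["https://en.wikipedia.org/wiki/Luckin_Coffee"],
};

// Embedded in the page <head> so crawlers (and LLM retrieval) can parse it:
// <script type="application/ld+json">{JSON.stringify(luckinJsonLd)}</script>
```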

How we built it

Architecture: Multi-LLM Scanner → Brand Extractor → SoM Scorer → Tavily Gap Analyzer → Content Generator → Interactive Dashboard

Frontend: Next.js 14 (App Router) + Tailwind CSS + Recharts, deployed on Vercel

AI Systems (3 LLMs queried live per audit):

  • OpenAI GPT-4o — LLM response scanning and brand mention analysis
  • Anthropic Claude Sonnet 4.5 — cross-model comparison + content generation engine
  • Perplexity Sonar — provides real citations that other LLMs don't, critical for citation gap analysis
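
A sketch of how one query fans out to all three providers in parallel. The model identifiers and the Perplexity base URL are assumptions rather than confirmed project code; Perplexity exposes an OpenAI-compatible endpoint, so the same SDK can drive it:

```typescript
import OpenAI from "openai";
import Anthropic from "@anthropic-ai/sdk";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
// Perplexity speaks the OpenAI chat-completions protocol.
const perplexity = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: "https://api.perplexity.ai",
});

async function scanQuery(query: string) {
  // allSettled: one slow or failing provider never blocks the others.
  const [gpt, claude, sonar] = await Promise.allSettled([
    openai.chat.completions.create({
      model: "gpt-4o",
      messages: [{ role: "user", content: query }],
    }),
    anthropic.messages.create({
      model: "claude-sonnet-4-5",
      max_tokens: 1024,
      messages: [{ role: "user", content: query }],
    }),
    perplexity.chat.completions.create({
      model: "sonar", // returns citations alongside the answer
      messages: [{ role: "user", content: query }],
    }),
  ]);
  return { gpt, claude, sonar };
}
```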

Web Intelligence: Tavily API for real-time web search across 4 parallel queries, content extraction, and competitor presence analysis
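
A hedged sketch of the parallel Tavily calls. The request fields, auth style, and query wording here are assumptions; the Tavily docs are the authoritative schema:

```typescript
const TAVILY_URL = "https://api.tavily.com/search";

async function tavilySearch(query: string) {
  const res = await fetch(TAVILY_URL, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.TAVILY_API_KEY}`,
    },
    body: JSON.stringify({ query, max_results: 5 }),
  });
  if (!res.ok) throw new Error(`Tavily error ${res.status}`);
  return res.json();
}

// Four footprint queries fired in parallel (wording is illustrative).
async function analyzeFootprint(brand: string, city: string) {
  const queries = [
    `${brand} ${city} reviews`,
    `${brand} vs competitors price`,
    `best coffee chains ${city}`,
    `${brand} menu ${city}`,
  ];
  return Promise.all(queries.map(tavilySearch));
}
```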

Brand Extraction Engine: Custom NLP pipeline with 22 known coffee brands, 50+ alias mappings, regex-based entity detection with word boundary handling, sentence context capture, position tracking, and keyword-based sentiment analysis
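
A simplified sketch of the extraction loop. The alias table and sentiment keywords below are tiny illustrative subsets of the 22-brand, 50+-alias mappings, not the real lists:

```typescript
// Tiny illustrative subset of the alias mappings.
// Bare "joe" is deliberately NOT an alias: it's slang for any coffee.
const BRAND_ALIASES: Record<string, string[]> = {
  "Luckin Coffee": ["luckin coffee", "luckin"],
  "Joe Coffee": ["joe coffee company", "joe coffee"],
  "Starbucks": ["starbucks", "sbux"],
};

const NEGATIVE = ["scandal", "fraud", "avoid"];
const POSITIVE = ["best", "popular", "recommended"];

interface Mention {
  brand: string;
  position: number; // character offset: earlier = more prominent
  sentiment: "positive" | "negative" | "neutral";
}

function extractMentions(answer: string): Mention[] {
  const mentions: Mention[] = [];
  for (const [brand, aliases] of Object.entries(BRAND_ALIASES)) {
    for (const alias of aliases) {
      const escaped = alias.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
      // \b word boundaries stop aliases matching inside other words.
      const match = new RegExp(`\\b${escaped}\\b`, "i").exec(answer);
      if (!match) continue;
      // Keyword sentiment over the surrounding sentence-sized context.
      const ctx = answer
        .slice(Math.max(0, match.index - 80), match.index + 80)
        .toLowerCase();
      const sentiment = NEGATIVE.some((w) => ctx.includes(w))
        ? "negative"
        : POSITIVE.some((w) => ctx.includes(w))
          ? "positive"
          : "neutral";
      mentions.push({ brand, position: match.index, sentiment });
      break; // first alias hit per brand is enough
    }
  }
  return mentions.sort((a, b) => a.position - b.position);
}
```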

Key design decisions:

  • LLM-agnostic scanning: identical queries sent to all 3 models for apples-to-apples comparison
  • Evidence-first: every finding links back to raw LLM responses and cited URLs
  • Progressive rendering: core audit results display immediately while web intelligence and content generation load in parallel
  • Circuit breaker pattern: automatically skips providers after repeated failures to prevent one slow API from blocking the entire audit (sketched after this list)
  • Live-only: if APIs fail, the system shows errors — no fallback to static data
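
The circuit breaker bullet above, as a minimal sketch. The failure threshold and cooldown are illustrative defaults, not the project's tuned values:

```typescript
// Minimal circuit breaker per provider.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;

  constructor(
    private readonly maxFailures = 3,
    private readonly cooldownMs = 60_000
  ) {}

  // Closed, or half-open after the cooldown: the provider may be called.
  canRequest(): boolean {
    if (this.failures < this.maxFailures) return true;
    return Date.now() - this.openedAt > this.cooldownMs;
  }

  recordSuccess(): void {
    this.failures = 0; // any success fully closes the breaker
  }

  recordFailure(): void {
    this.failures += 1;
    if (this.failures >= this.maxFailures) this.openedAt = Date.now();
  }
}

// One breaker per provider; an open breaker means that provider is skipped
// for the rest of the audit instead of blocking the remaining calls.
const breakers: Record<string, CircuitBreaker> = {
  openai: new CircuitBreaker(),
  anthropic: new CircuitBreaker(),
  perplexity: new CircuitBreaker(),
};
```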

Challenges we ran into

  • Entity disambiguation: LLMs sometimes say "joe coffee" generically (slang for any coffee) vs. referring to the Joe Coffee brand — required careful entity extraction with word boundary matching and 50+ alias patterns
  • Rate limiting & timeouts: Running 13 queries × 3 LLMs = 39 API calls per audit — needed parallel execution with per-query timeouts and a circuit breaker to prevent cascading failures (see the timeout sketch after this list)
  • Tavily crawlability: Luckin's website (luckincoffee.us) is heavily JavaScript-rendered, making content extraction difficult — which is itself a key finding (if Tavily can't read it, LLMs can't either)
  • Fraud narrative persistence: Some LLMs still surface Luckin's 2020 accounting scandal — our sentiment detection flags these negative mentions so leadership knows where reputation management is needed
  • LLM response variability: The same question asked twice can yield different brand mentions — we mitigate this with 13 diverse questions across 7 categories to get statistically meaningful SoM scores
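
A sketch of the per-query timeout wrapper mentioned above. The 20-second default and the `AskFn` signature are assumptions for illustration:

```typescript
type AskFn = (query: string) => Promise<string>;

// Rejects if the wrapped promise doesn't settle within `ms` milliseconds.
async function withTimeout<T>(promise: Promise<T>, ms = 20_000): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer);
  }
}

// 13 queries × 3 providers = 39 parallel calls; allSettled keeps one
// failure or timeout from sinking the whole audit.
async function scanAll(queries: string[], providers: AskFn[]) {
  return Promise.allSettled(
    queries.flatMap((q) => providers.map((ask) => withTimeout(ask(q))))
  );
}
```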

Accomplishments that we're proud of

  • Fully live, zero-mock architecture: Every data point on the dashboard comes from real-time API calls — SoM scores, brand mentions, web intelligence, and recommendations are all computed fresh per audit
  • 3-model comparison reveals non-obvious insights: We discovered that different AI platforms treat brands very differently — Perplexity and ChatGPT may mention Luckin while Claude doesn't, revealing platform-specific optimization opportunities
  • Actionable output, not just analytics: The auto-generated Wikipedia summary, Schema.org structured data, and competitor comparison outline are ready to implement — not theoretical advice
  • End-to-end in one click: From "Run Agent" to a full audit report with visualizations, competitor analysis, content gaps, generated content, and prioritized recommendations — all in under 30 seconds

What we learned

  • GEO is the new SEO: Princeton research shows that adding statistics and credible citations to web content boosts AI visibility by 30-40%. Traditional keyword stuffing actually decreases visibility by 10%.
  • The visibility gap is massive: AI recommends only a small fraction of businesses in a given area. Appearing in an AI answer is significantly harder than ranking in traditional search results.
  • Each LLM has its own worldview: ChatGPT, Claude, and Perplexity surface different brands for the same query — optimizing for one doesn't guarantee visibility across all three. A multi-model strategy is essential.
  • Web presence is table stakes: Brands without strong Wikipedia articles, structured data, and review coverage are virtually invisible to AI. Luckin's JS-heavy website is a concrete example of this gap.

What's next for GEO Agent

  • SoM tracking over time: Monthly monitoring to measure whether content strategy changes actually improve AI visibility
  • Misinformation detection: Flag when LLMs make incorrect claims about the brand (wrong store count, outdated negative narratives)
  • Multi-brand support: Extend to any brand in any category — the architecture is brand-agnostic, just change the target brand and competitor list
  • Interactive "Ask the Agent" mode: Follow-up conversational interface for deeper analysis on any specific finding
  • Automated content deployment: Push generated Schema.org markup and content recommendations directly to the brand's CMS

Built With

  • anthropic-claude
  • next.js
  • node.js
  • openai-gpt-4o
  • perplexity-api
  • react
  • recharts
  • tailwind-css
  • tavily-api
  • vercel