TaoSage - Generative Engine Optimization for Taobao Arbitrage

Live: https://geo-agent-five.vercel.app
Stack: Next.js 16 + Gemini + Tavily + Upstash Vector + OpenRouter
Theme: shadcn/ui Indigo / Neutral / Nova / Nunito Sans


What It Does

GEO Agent helps users discover Taobao arbitrage opportunities by comparing Chinese supplier products against US market alternatives. Users describe a product in chat; the system then runs a multi-step analysis pipeline and opens a detailed report in a new tab with:

  • Arbitrage Comparison - Taobao vs US alternatives side-by-side with savings %
  • Share of Model (SoM) - How visible you are in AI-generated answers
  • Competitor Analysis - Who dominates the AI landscape for your product
  • Gap Analysis - Where opportunities exist
  • Generated Content - SEO/GEO-optimized blog posts, FAQs, schema markup (JSON-LD)
  • Interactive Charts - Savings bar chart, SoM pie chart, competitor mentions bar chart

When multiple users search for similar products (conversation-level similarity > 0.80), the system auto-generates a static SEO blog guide at /guides/[slug] via ISR.
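Each guide needs a stable, URL-safe slug. The sketch below shows one way to derive it from a product phrase. `toGuideSlug` is a hypothetical helper, not taken from the project, and it assumes Latin-script product names; Chinese titles would need transliteration or an ID-based slug instead.

```typescript
// Hypothetical helper: turns a product phrase into a URL-safe slug for /guides/[slug].
function toGuideSlug(product: string): string {
  return product
    .toLowerCase()
    .normalize("NFKD")                // split accented letters into base letter + mark
    .replace(/[\u0300-\u036f]/g, "")  // drop the combining marks
    .replace(/[^a-z0-9]+/g, "-")      // collapse every other run of characters into one hyphen
    .replace(/^-+|-+$/g, "");         // trim leading and trailing hyphens
}
```

For example, `toGuideSlug("Brandable Toy Water Guns!")` yields `brandable-toy-water-guns`.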


Architecture

User Chat Input
      |
      v
  /api/analyze (SSE streaming)
      |
      +--> Step 0: Store conversation in Upstash + check similarity gate
      +--> Step 1: Generate search queries (Gemini)
      +--> Step 2: Translate queries to Chinese (Gemini)
      +--> Step 3: Search Taobao (Tavily, include_domains) + translate results to English
      +--> Step 4: Search US market (Tavily + Perplexity + GPT-4o-mini in parallel)
      +--> Step 4.5: Generate arbitrage comparison (Gemini)
      +--> Step 5: Compute SoM scores
      +--> Step 6: Embed & match products via Upstash Vector
      +--> Step 7: Gap analysis (Gemini)
      +--> Step 8: Content generation (Gemini)
      |
      v
  Report opens in /report (reads from localStorage)
      |
      v
  If similarity gate triggered --> POST /api/guides --> static guide at /guides/[slug]
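
Step 4 fans out to several providers at once. A minimal sketch of that fan-out, assuming each provider can be wrapped as an async function that returns text snippets; the `Provider` shape and `searchUsMarket` are illustrative names, and the real Tavily and OpenRouter clients are not shown:

```typescript
// Assumed shape: a named provider with an async search function.
interface Provider {
  name: string;
  search: (query: string) => Promise<string[]>;
}

interface ProviderOutcome {
  name: string;
  results: string[]; // empty when the provider's promise rejected
  error?: string;
}

// Starts every provider search at once and waits for all of them.
// A rejected search is converted into an outcome with an error message,
// so one failing provider does not discard the others' results.
async function searchUsMarket(
  query: string,
  providers: Provider[],
): Promise<ProviderOutcome[]> {
  return Promise.all(
    providers.map((p) =>
      p.search(query).then(
        (results): ProviderOutcome => ({ name: p.name, results }),
        (err): ProviderOutcome => ({ name: p.name, results: [], error: String(err) }),
      ),
    ),
  );
}
```

Total latency is then roughly that of the slowest provider rather than the sum of all three.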

How It Works

1. Chat-First Flow

The user types a product description (e.g., "brandable toy water guns for promotional distribution"). The first message triggers the full /api/analyze SSE pipeline, and progress updates stream back in real time. When the pipeline completes, the report opens automatically in a new tab.
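Server-Sent Events are plain text frames: an optional `event:` line, one or more `data:` lines, and a blank line as terminator. The sketch below formats and parses one progress frame. The `PipelineProgress` fields and the `progress` event name are assumptions made for illustration, not the project's actual payload.

```typescript
// Assumed progress payload; the real pipeline's fields may differ.
interface PipelineProgress {
  step: number;
  label: string;
  done: boolean;
}

// Serializes one progress update as an SSE frame (terminated by a blank line).
function formatSseEvent(event: string, payload: PipelineProgress): string {
  return `event: ${event}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Parses a single SSE frame back into its event name and JSON payload.
// Returns null when the frame carries no data lines.
function parseSseEvent(
  frame: string,
): { event: string; payload: PipelineProgress } | null {
  let event = "message"; // SSE default when no event line is present
  const dataLines: string[] = [];
  for (const line of frame.split("\n")) {
    if (line.startsWith("event:")) event = line.slice(6).trim();
    else if (line.startsWith("data:")) dataLines.push(line.slice(5).trim());
  }
  if (dataLines.length === 0) return null;
  return { event, payload: JSON.parse(dataLines.join("\n")) as PipelineProgress };
}
```

On the server, each frame would be written to the response stream as a pipeline step finishes; the chat page parses frames as they arrive and updates its progress display.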

2. Arbitrage Comparison

The pipeline searches Taobao (via Tavily with Chinese queries) and the US market (via Tavily + Perplexity + GPT) in parallel. Gemini then generates a structured comparison: same need, Taobao offering vs US alternative, with pricing, MOQ, lead times, customization, and estimated savings %.
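In the pipeline the savings figure comes from Gemini's structured output, but the arithmetic behind a savings percentage is simple enough to check by hand. The sketch below is one plausible definition, assuming savings are measured on landed cost (unit price plus per-unit shipping) relative to the US price; the `Offer` fields and `savingsPercent` are illustrative names.

```typescript
// Assumed offer shape: both prices already converted to USD.
interface Offer {
  unitPriceUsd: number;
  shippingPerUnitUsd: number;
}

// Percentage saved by buying the Taobao offer instead of the US one,
// rounded to one decimal place. Negative when Taobao is more expensive.
function savingsPercent(taobao: Offer, us: Offer): number {
  const taobaoLanded = taobao.unitPriceUsd + taobao.shippingPerUnitUsd;
  const usLanded = us.unitPriceUsd + us.shippingPerUnitUsd;
  if (usLanded <= 0) throw new RangeError("US landed cost must be positive");
  return Math.round(((usLanded - taobaoLanded) / usLanded) * 1000) / 10;
}
```

A Taobao item at $2 plus $1 shipping against a $10 US item gives (10 - 3) / 10 = 70% savings. Duties, MOQ, and lead time are not priced in here.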

3. Similarity Gate (Auto-Guide Generation)

Every user's full conversation history is embedded as a single vector in Upstash. When a new analysis runs, the system checks whether one or more other users have similar conversation histories (cosine similarity > 0.80). If the gate is triggered:

  • A "How to Source [Product] from Taobao" blog post is auto-generated
  • Saved to Upstash as a guide document
  • Served as a static ISR page at /guides/[slug]

For the two-person demo, minUsers = 1, so a single other user with a similar history triggers the gate.
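The gate logic can be sketched locally as follows. This is an illustration under stated assumptions, not the project's code: in the deployed app the similarity scores would come from the vector index query rather than being computed by hand, `StoredConversation` and `gateTriggered` are hypothetical names, and a hosted index may report a normalized score rather than raw cosine, so the 0.80 threshold should be checked against whatever the query actually returns.

```typescript
// Raw cosine similarity between two equal-length vectors, in [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / Math.sqrt(normA * normB);
}

// Assumed record: one embedded conversation history per user.
interface StoredConversation {
  userId: string;
  vector: number[];
}

// True when at least `minUsers` OTHER users have a conversation vector
// whose similarity to the current one exceeds `threshold`.
function gateTriggered(
  current: StoredConversation,
  others: StoredConversation[],
  threshold = 0.8,
  minUsers = 1,
): boolean {
  const similarUsers = new Set<string>();
  for (const other of others) {
    if (other.userId === current.userId) continue; // never match a user against themselves
    if (cosineSimilarity(current.vector, other.vector) > threshold) {
      similarUsers.add(other.userId);
    }
  }
  return similarUsers.size >= minUsers;
}
```

Counting distinct user IDs rather than matching vectors keeps one user with several stored histories from triggering the gate alone.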

4. Follow-Up Chat

After the report is generated, users can ask follow-up questions in the same chat. These go through /api/chat with the full report as context, powered by Gemini function calling with 6 tools (search, translate, analyze, etc.).


Charts

Three Recharts-based visualizations using shadcn/ui <ChartContainer>:

  1. Savings by Category (horizontal bar) - % savings per arbitrage item
  2. SoM Distribution (donut/pie) - competitor share of AI-generated answers
  3. Competitor Mentions & SoM (grouped bar) - mentions + SoM% side by side
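
The second and third charts need one row per competitor with a mention count and an SoM percentage. The sketch below shows one way to derive those rows from raw AI answers. The definition used here (a brand's share of all brand mentions, counting at most one mention per answer) is an assumption, as are the names `SomRow` and `computeShareOfModel`.

```typescript
// One chart row per brand.
interface SomRow {
  brand: string;
  mentions: number;   // number of answers that mention the brand
  somPercent: number; // share of all counted mentions, one decimal place
}

// Counts case-insensitive brand mentions across AI answers and converts
// them to percentage shares, sorted from most to least mentioned.
function computeShareOfModel(answers: string[], brands: string[]): SomRow[] {
  const counts = brands.map((brand) => {
    const needle = brand.toLowerCase();
    const mentions = answers.filter((a) => a.toLowerCase().includes(needle)).length;
    return { brand, mentions };
  });
  const total = counts.reduce((sum, c) => sum + c.mentions, 0);
  return counts
    .map((c) => ({
      ...c,
      somPercent: total === 0 ? 0 : Math.round((c.mentions / total) * 1000) / 10,
    }))
    .sort((a, b) => b.mentions - a.mentions);
}
```

The resulting array can be passed directly as the `data` prop of a Recharts bar or pie chart, with `brand` as the category key.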

Deployment

Vercel

vercel --prod
  • Free tier: 60s function timeout (pipeline may need optimization)
  • Pro tier: maxDuration = 300 works as configured
  • Set all 5 environment variables in Vercel dashboard
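
The maxDuration setting is a Next.js route segment config. A minimal sketch, assuming the pipeline runs in an App Router route handler; the file path in the comment is a guess, not confirmed by the project:

```typescript
// app/api/analyze/route.ts (assumed location)
// Asks the platform to allow this route's function to run for up to 300 seconds.
export const maxDuration = 300;
```

The value only takes effect on plans whose limit allows it; on a lower limit the pipeline has to finish within that plan's timeout.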

Local Development

npm install
# Create .env.local with all 5 variables
npm run dev

Models Used

Model                   Provider       Purpose
gemini-3-flash-preview  Google Gemini  Translation, queries, analysis, content, arbitrage comparison, embeddings
text-embedding-004      Google Gemini  Vector embeddings (768 dimensions)
perplexity/sonar        OpenRouter     AI search with citations
openai/gpt-4o-mini      OpenRouter     Market analysis, competitor research

Key Design Decisions

  1. Conversation-level embedding (not per-query) - Captures user intent across entire sessions for better similarity matching
  2. Taobao URLs stay in Chinese - Source links preserved as-is, only titles/descriptions translated to English
  3. localStorage bridge for report - Chat page stores report, report page reads it. Simple, no server state needed
  4. SSE streaming - Real-time progress updates during the 30-60s pipeline
  5. Parallel API calls - Tavily + Perplexity + GPT run simultaneously in Step 4
  6. Null safety guards - All Gemini JSON responses guarded against missing fields (impactTags || [], gaps?.length, Array.isArray())
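
The null safety guards in item 6 can be centralized in a single normalizing step. The sketch below is illustrative: the field names `impactTags` and `gaps` come from the list above, but treating both as string arrays, and the names `GapReport`, `normalizeGapReport`, and `parseModelJson`, are assumptions.

```typescript
// Assumed shape of the gap-analysis payload after normalization.
interface GapReport {
  impactTags: string[];
  gaps: string[];
}

// Coerces any parsed JSON value into a GapReport with no missing fields.
function normalizeGapReport(raw: unknown): GapReport {
  const obj = (raw !== null && typeof raw === "object" ? raw : {}) as Record<string, unknown>;
  // Keeps only string entries; anything that is not an array becomes [].
  const strings = (v: unknown): string[] =>
    Array.isArray(v) ? v.filter((x): x is string => typeof x === "string") : [];
  return {
    impactTags: strings(obj["impactTags"]),
    gaps: strings(obj["gaps"]),
  };
}

// Parses model output, falling back to an empty report on invalid JSON.
function parseModelJson(text: string): GapReport {
  try {
    return normalizeGapReport(JSON.parse(text));
  } catch {
    return normalizeGapReport(null);
  }
}
```

With this in place, rendering code can call `report.gaps.length` or `report.impactTags.map(...)` without per-call guards.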

Built With

  • claude
  • github
  • nextjs
  • redbull
  • tavily
  • vercel
  • vscode