Vendo - Turn a Photo Into a Sales Agent

The Problem That Inspired This

500 million small sellers in emerging markets such as Brazil, Indonesia, Vietnam, Mexico, and China are invisible online. They take blurry phone photos of their products, paste them into WhatsApp groups, and wait. Usually nobody replies.

Running a professional online store requires skills they don't have. Filming product videos requires equipment they can't afford. Speaking to buyers in multiple languages is impossible alone.

Meanwhile, big brands run 24/7 AI-powered live commerce experiences on TikTok and Shopee Live that convert at 3-5x the rate of static listings.

The gap between a street vendor in Jakarta and a brand with a $10M marketing budget has never been wider. Vendo exists to close it.


What Vendo Does

Upload one product photo. In under 2 minutes, Vendo:

  • Analyzes the image using ERNIE LLM to extract product name, description, category, and price
  • Writes a sales script in the seller's chosen language: English, Portuguese, Indonesian, Chinese, Spanish, or Vietnamese
  • Generates a cinematic product scene using Nano Banana image generation
  • Creates an AI presenter video: a realistic avatar speaking the script, generated with HeyGen
  • Builds a complete shop page at a shareable link
  • Adds an AI chat assistant that answers buyer questions in their language using only the shop's real product data
  • Handles checkout via Stripe

The result: a live-commerce-style storefront that looks like a TikTok Live but runs automatically, 24/7, with no seller involvement after the initial upload.


How I Built It

Vendo is built entirely on MeDo using its plugin ecosystem:

Plugin                               Purpose
Large Language Model (ERNIE)         Image analysis, script writing, chat responses
Image Generation (Nano Banana Pro)   Product scene generation
Google Text Translation              Multilingual UI
Stripe Payments                      Checkout

Other APIs integrated:

  • HeyGen API - AI avatar video generation with ethnicity-matched presenters per language
  • HTML5 Canvas API (browser-native) - Real-time video compositing (avatar floating over product scene)

The buyer shop page sets the HTML5 Canvas globalCompositeOperation to 'screen' to composite the HeyGen avatar video over the Nano Banana product scene, creating the effect of an AI influencer presenting the product in a cinematic environment.
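A minimal sketch of that compositing step, assuming the scene and the avatar are drawn at the full canvas size. The names Ctx2DLike and drawCompositeFrame are hypothetical and not taken from the Vendo codebase; the interface only mirrors the two members of the browser's 2D context that the sketch touches, so it type-checks outside a browser too.

```typescript
// Hypothetical structural stand-in for the parts of CanvasRenderingContext2D used here.
interface Ctx2DLike {
  globalCompositeOperation: string;
  drawImage(source: unknown, x: number, y: number, w: number, h: number): void;
}

// Draw one composited frame: the product scene first, then the avatar video in 'screen' mode.
function drawCompositeFrame(
  ctx: Ctx2DLike,
  scene: unknown,  // e.g. an image element holding the generated product scene
  avatar: unknown, // e.g. a video element playing the black-background avatar clip
  width: number,
  height: number,
): void {
  ctx.globalCompositeOperation = "source-over"; // default mode: scene covers the previous frame
  ctx.drawImage(scene, 0, 0, width, height);
  ctx.globalCompositeOperation = "screen";      // black avatar pixels leave the scene visible
  ctx.drawImage(avatar, 0, 0, width, height);
  ctx.globalCompositeOperation = "source-over"; // restore the default for any later drawing
}
```

In a browser this would typically be called from a requestAnimationFrame callback with the canvas's real 2D context, so that each display refresh picks up the current video frame.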

The AI chat uses a strict system prompt that feeds only real product data to the LLM, which limits hallucination and keeps answers to accurate information about products in that specific shop.
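One way such a grounded system prompt could be assembled is sketched below. The Product fields follow the attributes Vendo extracts from the photo (name, description, category, price); the function name buildSystemPrompt and the exact instruction wording are assumptions for illustration, not the prompt Vendo actually ships.

```typescript
// Product attributes as extracted from the uploaded photo.
interface Product {
  name: string;
  description: string;
  category: string;
  price: string; // kept as display text, e.g. "15 USD"
}

// Build a system prompt that restricts the assistant to the shop's own catalog.
function buildSystemPrompt(shopName: string, products: Product[]): string {
  const catalog = products
    .map((p, i) => `${i + 1}. ${p.name} | ${p.category} | ${p.price} | ${p.description}`)
    .join("\n");
  return [
    `You are the sales assistant for the shop "${shopName}".`,
    "Answer only from the PRODUCT DATA below.",
    "If the answer is not in the PRODUCT DATA, say you do not know and offer to ask the seller.",
    "Reply in the language the buyer writes in.",
    "",
    "PRODUCT DATA:",
    catalog,
  ].join("\n");
}
```

Keeping the catalog in one clearly labeled block makes it easy to check what the model was allowed to see when a buyer reports a wrong answer.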


The Challenges

1. Avatar transparency: HeyGen outputs MP4 videos, which carry no alpha channel. Getting the avatar to appear to "float" over the product background required generating videos on a pure black (#000000) background and drawing them with the Canvas 'screen' blend mode, under which black pixels leave the backdrop unchanged and so behave as transparent.
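The reason black behaves as transparent follows from the standard screen blend formula, result = 1 - (1 - backdrop) × (1 - source), applied per color channel. The sketch below applies it to 8-bit channel values; the function name screenChannel is only illustrative.

```typescript
// Screen blend for one 8-bit color channel (0..255).
// result = 255 - (255 - backdrop) * (255 - source) / 255
function screenChannel(backdrop: number, source: number): number {
  return Math.round(255 - ((255 - backdrop) * (255 - source)) / 255);
}

// A black source pixel (0) returns the backdrop unchanged.
console.log(screenChannel(120, 0));   // 120
// A white source pixel (255) always yields white.
console.log(screenChannel(120, 255)); // 255
// Over a black backdrop, the source passes through as-is.
console.log(screenChannel(0, 200));   // 200
```

One consequence of this approach is that dark regions of the avatar itself (dark hair or clothing) also let some of the scene show through, since screen blending can only lighten the backdrop.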

2. Multilingual voice quality: Built-in TTS voices sounded robotic in non-English languages. I solved this by picking voices from HeyGen's native voice library that match each avatar's gender and language, using a female voice per language for natural delivery.

3. Pipeline reliability: Chaining LLM → image generation → video generation into a single automated pipeline required careful error handling at each step, status polling for async operations, and graceful fallbacks.
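The polling and fallback pattern can be sketched as follows. This is a generic illustration under stated assumptions, not Vendo's actual pipeline code: JobStatus, pollUntilDone, and withFallback are hypothetical names, and checkStatus stands in for whatever status call the video or image service exposes.

```typescript
// Hypothetical status shape for an asynchronous generation job.
type JobStatus =
  | { state: "pending" }
  | { state: "done"; url: string }
  | { state: "failed"; reason: string };

// Poll a status function until the job finishes, fails, or the attempt budget runs out.
async function pollUntilDone(
  checkStatus: () => Promise<JobStatus>,
  intervalMs: number,
  maxAttempts: number,
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await checkStatus();
    if (status.state === "done") return status.url;
    if (status.state === "failed") throw new Error(status.reason);
    // Still pending: wait before asking again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("timed out waiting for job");
}

// Run one pipeline step and substitute a fallback value if it throws.
async function withFallback<T>(step: () => Promise<T>, fallback: T): Promise<T> {
  try {
    return await step();
  } catch {
    return fallback;
  }
}
```

With these two helpers, a failed or slow video job could degrade to a static product image (for example, withFallback(() => pollUntilDone(check, 5000, 60), sceneImageUrl)) instead of blocking the whole shop page.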

4. Real-time canvas compositing: Running a 60fps canvas draw loop that composites a background image, gradient overlay, and video simultaneously without frame drops required careful performance optimization.


What I Learned

  • Prompt engineering for LLMs is as important as the model itself: the difference between a generic script and a compelling sales pitch comes down to how you structure the prompt
  • Canvas compositing unlocks visual effects that pure CSS cannot reliably achieve
  • The most impressive demos come from real data: every product image and avatar in Vendo is genuinely AI-generated, not mocked

The Vision

Vendo is the beginning of a world where every small seller, regardless of technical skill, language, or budget, has the same AI-powered sales presence as the world's biggest brands.

One photo. Two minutes. Six languages. Your AI sales agent is live.
