Vendo - Turn a Photo Into a Sales Agent
The Problem That Inspired This
500 million small sellers in emerging markets such as Brazil, Indonesia, Vietnam, Mexico, and China are invisible online. They take blurry phone photos of their products, paste them into WhatsApp groups, and wait. Usually nobody replies.
Running a professional online store requires skills they don't have. Filming product videos requires equipment they can't afford. Speaking to buyers in multiple languages is impossible alone.
Meanwhile, big brands run 24/7 AI-powered live commerce experiences on TikTok and Shopee Live that convert at 3-5x the rate of static listings.
The gap between a street vendor in Jakarta and a brand with a $10M marketing budget has never been wider. Vendo exists to close it.
What Vendo Does
Upload one product photo. In under 2 minutes, Vendo:
- Analyzes the image using ERNIE LLM to extract product name, description, category, and price
- Writes a sales script in the seller's chosen language: English, Portuguese, Indonesian, Chinese, Spanish, or Vietnamese
- Generates a cinematic product scene using Nano Banana image generation
- Creates an AI presenter video: a realistic avatar speaking your script, rendered with HeyGen
- Builds a complete shop page at a shareable link
- Adds an AI chat assistant that answers buyer questions in their language with zero hallucination
- Handles checkout via Stripe
The result: a live-commerce style storefront that looks like a TikTok Live but runs automatically, 24/7, with no seller involvement after the initial upload.
How I Built It
Vendo is built entirely on MeDo using its plugin ecosystem:
| Plugin | Purpose |
|---|---|
| Large Language Model (ERNIE) | Image analysis, script writing, chat responses |
| Image Generation (Nano Banana Pro) | Product scene generation |
| Google Text Translation | Multilingual UI |
| Stripe Payments | Checkout |
External and browser APIs integrated:
- HeyGen API - AI avatar video generation with ethnicity-matched presenters per language
- Canvas API - Real-time video compositing (avatar floating over product scene)
The buyer shop page uses HTML5 Canvas with globalCompositeOperation set to 'screen' to composite the HeyGen avatar video over the Nano Banana product scene, creating the effect of an AI influencer presenting your product in a cinematic environment.
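The compositing step can be sketched roughly as follows. This is a minimal illustration, not Vendo's actual code: `screenBlend`, `Ctx2DLike`, and `drawCompositeFrame` are hypothetical names, and the small interface stands in for the subset of `CanvasRenderingContext2D` used here so the sketch can run outside a browser. The first function shows the per-channel arithmetic behind the `screen` mode, which is why a black source pixel leaves the scene untouched.

```typescript
// Per-channel "screen" blend on 0-255 values:
//   result = 255 - (255 - src) * (255 - dst) / 255
// A black source pixel (src = 0) returns dst unchanged.
function screenBlend(src: number, dst: number): number {
  return Math.round(255 - ((255 - src) * (255 - dst)) / 255);
}

// Hypothetical subset of CanvasRenderingContext2D needed by this sketch.
interface Ctx2DLike<Source> {
  globalCompositeOperation: string;
  clearRect(x: number, y: number, w: number, h: number): void;
  drawImage(source: Source, x: number, y: number, w: number, h: number): void;
}

// Draws one frame: the product scene first, then the avatar video on top
// using the "screen" blend mode so its black backdrop disappears.
function drawCompositeFrame<Source>(
  ctx: Ctx2DLike<Source>,
  scene: Source,
  avatar: Source,
  width: number,
  height: number,
): void {
  ctx.clearRect(0, 0, width, height);

  // Normal drawing for the background scene.
  ctx.globalCompositeOperation = "source-over";
  ctx.drawImage(scene, 0, 0, width, height);

  // Screen-blend the avatar frame over the scene.
  ctx.globalCompositeOperation = "screen";
  ctx.drawImage(avatar, 0, 0, width, height);

  // Restore the default so later drawing is unaffected.
  ctx.globalCompositeOperation = "source-over";
}
```

In a browser, `drawCompositeFrame` would presumably be called from a `requestAnimationFrame` loop with a 2D canvas context, an image element for the scene, and the playing video element for the avatar. One trade-off of this approach: screen blending also lightens non-black avatar pixels wherever the scene behind them is bright, so it is an approximation of true alpha transparency.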
The AI chat uses a strict system prompt that feeds only the shop's real product data to the LLM, which guards against hallucination and keeps answers limited to accurate information about products in that specific shop.
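A grounded system prompt of this kind might be assembled as in the sketch below. The `Product` shape, the function name `buildShopSystemPrompt`, and the exact rule wording are all assumptions for illustration; the real Vendo prompt is not shown in this writeup.

```typescript
// Hypothetical product record, mirroring the fields extracted from the photo.
interface Product {
  name: string;
  description: string;
  category: string;
  price: string; // kept as display text, e.g. "USD 12.00"
}

// Builds a system prompt that contains the shop's catalog and nothing else,
// plus explicit rules telling the model to refuse anything outside that data.
function buildShopSystemPrompt(shopName: string, products: Product[]): string {
  const catalog = products
    .map((p, i) => `${i + 1}. ${p.name} | ${p.category} | ${p.price} | ${p.description}`)
    .join("\n");

  return [
    `You are the sales assistant for the shop "${shopName}".`,
    "Answer ONLY from the product data below.",
    "If the answer is not in the data, say you do not know and offer to ask the seller.",
    "Never invent prices, stock levels, shipping terms, or features.",
    "Reply in the language the buyer writes in.",
    "",
    "PRODUCT DATA:",
    catalog,
  ].join("\n");
}
```

Scoping the prompt to a single shop's catalog means the model has no other product facts to draw on in context. A prompt like this narrows what the model can say but cannot by itself guarantee accuracy, so spot-checking real chat transcripts is still worthwhile.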
The Challenges
1. Avatar transparency: HeyGen outputs MP4 videos, and those files carry no alpha channel. To make the avatar appear to "float" over the product background, I generated the videos on a pure black (#000000) background and drew them with the Canvas screen blend mode, under which black pixels leave the background unchanged and so read as transparent.
2. Multilingual voice quality: Built-in TTS voices sounded robotic in non-English languages. Solved by selecting voices from HeyGen's native voice library to match each avatar's gender and language, with a female voice per language for natural delivery.
3. Pipeline reliability: Chaining LLM → image generation → video generation into a single automated pipeline required careful error handling at each step, status polling for async operations, and graceful fallbacks.
4. Real-time canvas compositing: Running a 60fps canvas draw loop that composites a background image, gradient overlay, and video simultaneously without frame drops required careful performance optimization.
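The status polling and graceful fallback described in challenge 3 could look roughly like the sketch below. Everything here is hypothetical: `JobStatus`, `pollUntilDone`, and `withFallback` are illustrative names, and `checkStatus` stands in for whatever call reports the state of an async video or image job.

```typescript
// Hypothetical status shape for an asynchronous generation job.
type JobStatus<T> =
  | { state: "pending" }
  | { state: "done"; result: T }
  | { state: "failed"; reason: string };

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls checkStatus until the job finishes, fails, or the attempt budget runs out.
async function pollUntilDone<T>(
  checkStatus: () => Promise<JobStatus<T>>,
  opts: { intervalMs: number; maxAttempts: number },
): Promise<T> {
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    const status = await checkStatus();
    if (status.state === "done") return status.result;
    if (status.state === "failed") throw new Error(`job failed: ${status.reason}`);
    // Still pending: wait before the next check, unless this was the last one.
    if (attempt < opts.maxAttempts) await sleep(opts.intervalMs);
  }
  throw new Error(`job still pending after ${opts.maxAttempts} checks`);
}

// Runs a pipeline step and substitutes a fallback value if it throws,
// so one failed stage does not take down the whole shop page.
async function withFallback<T>(step: () => Promise<T>, fallback: T): Promise<T> {
  try {
    return await step();
  } catch {
    return fallback;
  }
}
```

For example, a video step wrapped in `withFallback` could fall back to the static product scene when the avatar render fails or times out, so the shop page still publishes.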
What I Learned
- Prompt engineering for LLMs is as important as the model itself: the difference between a generic script and a compelling sales pitch is entirely in how you structure the prompt
- Canvas compositing unlocks visual effects that pure CSS cannot reliably achieve
- The most impressive demos come from real data: every product image and avatar in Vendo is genuinely AI-generated, not mocked
The Vision
Vendo is the beginning of a world where every small seller, regardless of technical skill, language, or budget, has the same AI-powered sales presence as the world's biggest brands.
One photo. Two minutes. Six languages. Your AI sales agent is live.
Built With
- ernie-llm
- google-text-translation
- heygen-api
- html5
- javascript
- medo
- nano-banana-pro-(image-generation)
- react
- stripe-payments
- supabase
- typescript