The Project Description
Voz — AI Commerce Infrastructure for Artisans the Internet Left Behind
What problem are we solving and who is it for?
Traditional and immigrant artisans represent one of the world's oldest and most culturally significant economic communities — and one of its most digitally excluded. Mexican textile weavers in East LA. Warli painters in Maharashtra. Vietnamese ceramicists in Garden Grove. Ethiopian basket weavers in Washington DC.
These makers produce work of extraordinary cultural and commercial value, yet they are systematically shut out of e-commerce by three compounding barriers:
- Language: Platforms are entirely English-first.
- Digital Literacy: Listing a product on platforms like Etsy requires navigating 15+ fields, SEO configuration, and complex payment setups.
- Infrastructure: There is no mechanism to photograph, describe, list, and sell without a computer and significant technical competence.
The result is that a $900 billion global artisan economy operates almost entirely through middlemen, informal markets, and word of mouth — channels that extract value from the maker and limit reach to a local radius. Consumers increasingly seek authentic, handmade, culturally significant products. The problem is a missing infrastructure layer between the maker and the market.
Voz exists to build that layer.
How our solution works
The vendor experience requires exactly three actions:
- Photograph the product.
- Speak about it in any language.
- Tap one button.
Within sixty seconds, they receive a live storefront URL they can share via WhatsApp with anyone in the world.
Behind those three actions, a multi-agent AI pipeline handles everything that would otherwise require digital literacy. The vendor never touches a form field, never writes a word of English, and never configures a website. The storefront itself includes a language toggle — so Priya, who speaks only Hindi, can switch her entire product page into Hindi to understand her own orders, prices, and customer messages. Her buyers see English; she sees her language. One tap.
Technical overview
Voz is built as a lightweight single-file frontend with a multi-agent backend pipeline. The frontend is a mobile-first progressive web application requiring no installation. Voice capture uses the browser's native MediaRecorder API.
The agent pipeline consists of four parallel agents and one sequential review agent:
- The Voice Agent: Handles multilingual transcription, converting the vendor's spoken description from any of 50+ languages into structured text.
- The Vision Agent: Performs multimodal analysis of the product photograph, extracting material composition, craft category, estimated dimensions, and cultural visual markers.
- The Listing Agent: Generates a complete product listing (SEO-optimized title, 150-200 word cultural description in English, suggested tags, category classification).
- The Storefront Agent: Takes all structured outputs, generates a complete HTML product page, stores it, and returns a live public URL.
- The Ethics Review Agent: Runs sequentially after the first three complete. It performs checks for cultural accuracy, pricing fairness, and misrepresentation. Nothing publishes without its approval.
Storage and URL serving use Supabase. Pricing intelligence uses the Fetch.ai ASI1 LLM, an OpenAI-compatible endpoint that performs market comparables analysis.
The architecture is intentionally modular:
{
"pipeline": ["VoiceAgent", "VisionAgent", "ListingAgent"],
"on_success": "StorefrontAgent",
"final_gate": "EthicsReviewAgent"
}
New agents can be added to the pipeline without restructuring the existing system.
What could go wrong
We've identified five failure modes honestly:
- Transcription accuracy degrades: Lower-resource languages may struggle. We surface the raw transcript to the vendor before listing generation so errors can be caught.
- Inaccurate cultural descriptions: AI models can hallucinate. Our Ethics Agent catches egregious cases, but human review remains necessary.
- Systematically biased pricing: If comparable product data skews, recommendations may disadvantage vendors. We show the pricing rationale, not just the number.
- Payment friction: V1 routes buyers to WhatsApp for payment/shipping. This is familiar to vendors but not a seamless checkout for modern e-commerce buyers.
- Bad actors: Users could misrepresent mass-produced goods as handmade. Visual analysis and ethics checks provide partial mitigation.
Safeguards we've built in
- Real-time UI Review: Every listing is reviewed by the Ethics Agent visibly in the UI before publication.
- Embedded Audio: The vendor's voice note is embedded on the storefront as playable audio, creating unprecedented transparency.
- Visible Provenance: All listings note that they were AI-generated from a voice description.
- No Auto-corrects: The platform never auto-corrects a flagged listing; it surfaces the concern and waits for human decision.
- Final Approval: Vendors always see the generated listing before it goes live.
How Voz empowers rather than replaces people
The artisan's voice, story, and craft remain the center of every transaction. The AI does not invent a narrative — it translates and amplifies the narrative the vendor already told in her own words.
The skills the vendor uses to operate Voz are skills she already has. Voz creates economic access without creating economic dependency. Vendors own their storefront URL, customer relationships, and pricing. The platform does not extract the value of their craft — it builds the bridge to the people who want to buy it.
The founder of Voz has spent four years working directly with artisan communities in India, learning regional languages, and building this infrastructure through failure. The technology is designed by someone who has watched this problem destroy livelihoods in real time.
Ethical considerations
Cultural representation is not a UX problem — it is an ethical obligation. We generate descriptions from the vendor's own words. It is the difference between a platform that speaks for an artisan and one that helps her speak for herself.
- Pricing fairness: Modeled explicitly to flag extractive pricing (too low or too high).
- Graceful exit: Vendors can export their product data and are not locked into the infrastructure.
- Acknowledging imperfection: We openly acknowledge AI limitations and build human checkpoints to catch errors before they cause harm.
The vision behind Voz is consistent with Dario Amodei's framing in Machines of Loving Grace: AI that gives people access to expertise, markets, and opportunity they could not otherwise reach. A Warli painter in Palghar who can now sell in Los Angeles without speaking English, without owning a computer, and without navigating a platform built for someone else — that is what this vision looks like made concrete.## Inspiration
Built With
- anthropic-api
- asi1
- fetch.ai
- html
- openai-compatible
- pwa
- supabase
- whisper
Log in or sign up for Devpost to join the conversation.