Note:
- Vibe currently supports most direct brand websites, though certain e-commerce platforms like Amazon have anti-scraping software that blocks Vibe's extraction tools. In case the extracted data lacks an image or key fields, please try again with a different product URL.
- If the hosted URL does not work (likely due to API key overuse), please use the GitHub link to run the project locally. The README offers a comprehensive, step-by-step explanation for testing.
- FFmpeg is very CPU-dependent, but our Cloud Run server only has 1 vCPU, so the hosted project takes extremely long to process the final video ad (up to 45 minutes, compared to 1 minute on local deployments)! We only realized this when we deployed on our last day, so we unfortunately didn't get a chance to switch to cloud-based video processing.
Inspiration
The rise of short-form video content has made video advertising essential for business success, with platforms like TikTok, Instagram Reels, and YouTube Shorts driving over 70% of consumer engagement on social media. However, traditional video advertising is cost-prohibitive and time-consuming, often requiring thousands of dollars of production costs and several weeks of turnaround time. It also demands specialized skills across market research, audience analysis, creative scripting, and video production.
We saw an opportunity to democratize video advertising by leveraging multi-agent frameworks and AI video generation to transform a simple product URL into a professional, market-ready video advertisement in minutes rather than weeks. We hope to empower small businesses and entrepreneurs to create studio-quality video ads with just a product link and a few clicks.
What it does
Vibe is an end-to-end, multi-agent video generation platform that transforms product URLs into professional video advertisements through a seamless 6-step pipeline:
Step 1: Product Analysis
The extraction_agent uses Tavily’s web scraping API to extract detailed product metadata — including images, pricing, features, and descriptions — from a user-provided product URL.
Step 2: Market Intelligence
The market_agent conducts deep market research using the extracted data, identifying competitors, target demographics, and current market trends relevant to the product category.
Step 3: Script Generation
The script_agent uses product and market insights to generate a short-form video script, including both audio narration (A-roll) and visual direction (B-roll) tailored for high engagement.
Step 4: A-Roll Production
The a_roll_agent generates presenter-style videos using HeyGen’s avatar and voiceover capabilities, bringing the A-roll script to life with studio-quality narration.
Step 5: B-Roll Creation
The b_roll_agent calls Google’s Veo 2 to produce supporting visuals, product showcases, and cinematic sequences that match the visual script.
Step 6: Final Assembly
The processing_agent uses FFmpeg to combine A-roll and B-roll into a polished final cut, ready for download or immediate posting on social media.
How we built it
We built Vibe using Google’s Agent Development Kit (ADK) to construct a fully autonomous, multi-agent pipeline. Our system demonstrates core ADK capabilities including agent orchestration, sequential workflows, tool calls, and shared state. We also integrated many Google Cloud products, such as Gemini 2.0 Flash, Veo 2, Cloud Storage, and Cloud Run.
1. Orchestrator
The manager_agent is the root agent that coordinates all sub-agents. It governs the execution order, handles failure retries, and ensures data flows properly through the system using ADK’s built-in state and thread constructs.
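The ordering-plus-retry idea can be illustrated with a minimal sketch in plain Python. This is not ADK code and not our actual manager_agent, which is an LLM agent steered by prompting; the names run_pipeline, stages, and max_retries are hypothetical and exist only for illustration.

```python
from typing import Callable

State = dict[str, object]
Stage = Callable[[State], None]


def run_pipeline(stages: list[tuple[str, Stage]], state: State, max_retries: int = 2) -> State:
    """Run each stage in order, retrying a failed stage before giving up.

    Hypothetical helper: a deterministic stand-in for the coordination that
    the manager_agent performs through prompting.
    """
    for name, stage in stages:
        for attempt in range(max_retries + 1):
            try:
                stage(state)  # each stage reads from and writes to the shared state
                break  # stage succeeded, move on to the next one
            except Exception as exc:
                if attempt == max_retries:
                    raise RuntimeError(
                        f"stage {name!r} failed after {attempt + 1} attempts"
                    ) from exc
    return state
```

Each stage is a function that mutates the shared dictionary, so a later stage (for example script generation) can read what an earlier stage (for example extraction) wrote.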
2. Metadata Extraction
We designed two agents to extract and preprocess product information:
- extraction_agent uses Tavily’s extract API to parse the product URL and gather structured metadata, including an image_url.
- save_agent saves the image_url and encodes the image in base64, storing the result in shared state.
These are then combined into a SequentialAgent called analysis_agent, ensuring deterministic execution and correct state transitions.
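As a rough illustration of the save step, the sketch below stores an image URL and a base64 copy of the image bytes in a plain dictionary standing in for shared state. The function name store_image and the keys image_url and image_base64 are assumptions for this example, not necessarily the names used in our code, and the download itself is left out.

```python
import base64


def store_image(state: dict, image_url: str, image_bytes: bytes) -> None:
    """Record the image URL and a base64 copy of the image in shared state.

    Hypothetical helper; the state keys are illustrative.
    """
    state["image_url"] = image_url
    # b64encode returns bytes; decode to str so the value is JSON-serializable.
    state["image_base64"] = base64.b64encode(image_bytes).decode("ascii")


state: dict = {}
store_image(state, "https://example.com/product.png", b"\x89PNG fake bytes")
```

Storing the encoded string rather than raw bytes keeps the state serializable, which matters when state is passed between agents or returned over HTTP.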
3. Market Analysis
Our market_agent accesses tools for:
- Market size and trend analysis
- Audience demographics and psychographics
- Competitor research
It uses metadata from earlier agents and Tavily’s search API to generate structured market insights, which are saved to state for later use.
4. Script Generation
script_agent combines metadata and market analysis to generate a 10-second video script. It uses Gemini via tool-calling prompts and adheres to a strict output schema, producing A-roll (voiceover) and B-roll (visuals) scripts.
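To show what a strict output schema can look like, here is a minimal sketch that validates a model response against a two-field structure. The field names a_roll and b_roll and the helper parse_script are assumptions for illustration; the real schema may carry more fields.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoScript:
    a_roll: str  # voiceover narration
    b_roll: str  # visual direction


def parse_script(raw: str) -> VideoScript:
    """Parse a JSON string and reject anything that is not exactly the expected shape."""
    data = json.loads(raw)
    expected = {"a_roll", "b_roll"}
    if not isinstance(data, dict) or set(data) != expected:
        raise ValueError(f"expected exactly the keys {sorted(expected)}")
    for key in expected:
        if not isinstance(data[key], str) or not data[key].strip():
            raise ValueError(f"{key} must be a non-empty string")
    return VideoScript(a_roll=data["a_roll"], b_roll=data["b_roll"])
```

Rejecting extra or missing keys up front means the downstream video agents never have to guess which part of the response is narration and which is visual direction.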
5. Video Generation
We generate two types of footage:
- a_roll_agent calls HeyGen to create avatar-driven A-roll based on the voiceover script.
- b_roll_agent calls Veo 2 to produce cinematic B-roll using the visual script.
Both agents save their outputs (MP4s) to shared state.
6. Post-Processing and Export
processing_agent uses FFmpeg to stitch A-roll and B-roll clips together. The final video is uploaded to a Google Cloud Storage bucket. All agents are containerized and deployed via Cloud Run for scalability and reliability.
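One simple way to stitch clips with FFmpeg is its concat demuxer, sketched below. This assumes plain back-to-back concatenation of clips that already share codec and resolution, which may not match how our processing_agent actually edits the footage; the helper names and file names are hypothetical. The code only builds the list file text and the command, it does not run FFmpeg.

```python
def concat_list(clips: list[str]) -> str:
    """Build the text of a concat-demuxer list file, one `file '...'` line per clip."""
    lines = []
    for clip in clips:
        # Inside single quotes, a literal quote is written as '\'' in FFmpeg's syntax.
        escaped = clip.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def concat_command(list_path: str, output_path: str) -> list[str]:
    """Return an ffmpeg argument list that joins the listed clips without re-encoding."""
    return [
        "ffmpeg", "-y",            # overwrite the output file if it exists
        "-f", "concat",            # use the concat demuxer
        "-safe", "0",              # allow arbitrary paths in the list file
        "-i", list_path,
        "-c", "copy",              # stream copy: no re-encode, so inputs must match
        output_path,
    ]


listing = concat_list(["a_roll.mp4", "b_roll.mp4"])
command = concat_command("clips.txt", "final.mp4")
```

After writing listing to clips.txt, the command can be passed to subprocess.run(command, check=True). Stream copy avoids re-encoding, which is relevant on a 1 vCPU server, but it only works when the clips are encoded compatibly; otherwise a re-encode is needed.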
Frontend Integration
The frontend was developed using TypeScript, Next.js, and Tailwind CSS. It communicates with the backend through FastAPI endpoints exposed by ADK’s api_server, enabling:
- Submission of product URLs
- Real-time progress tracking across agent stages
- Previewing of metadata, scripts, and final video output
The frontend is deployed on Vercel for rapid iteration and global delivery.
Challenges we ran into
Structuring the backend was tricky, since it was our first time using the Google ADK framework. Initially, we designed manager_agent as a SequentialAgent, but we later realized this caused all the sub-agents to run without pause, so we decided to use prompting to control each step instead.
Working with HeyGen and Veo 2 was hard as well, since we needed to parse the returned URLs to both save to state for post-processing and make public for frontend display. We ended up using artifacts extensively, as well as detailed regex code on the frontend.
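The URL parsing can be sketched as follows. Our regex code lives on the frontend; this is a Python rendering of the same idea, and the pattern, the .mp4 assumption, and the name extract_video_urls are illustrative rather than what we shipped.

```python
import re

# Assumption: video links end in .mp4, optionally followed by a query string.
VIDEO_URL = re.compile(r"""https?://[^\s"'<>()]+\.mp4(?:\?[^\s"'<>()]*)?""")


def extract_video_urls(text: str) -> list[str]:
    """Return the distinct .mp4 URLs found in text, in order of first appearance."""
    seen: list[str] = []
    for url in VIDEO_URL.findall(text):
        if url not in seen:
            seen.append(url)
    return seen
```

A pattern like this is sensitive to trailing punctuation inside a query string, so it is safer to have the agent emit URLs on their own line or inside quotes.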
Deployment took a long time, since we had never used Cloud Run before but wanted to integrate more Google Cloud products into our project. We had to modify some polling functions to prevent timeout issues with video generation.
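A polling loop with an explicit deadline is one way to keep long video jobs from hanging a request. The sketch below is a generic pattern under assumed names (poll_until_ready, check, interval, timeout), not our actual polling code; check stands in for whatever call asks the video service whether the job is done.

```python
import time
from typing import Callable, Optional


def poll_until_ready(
    check: Callable[[], Optional[str]],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call check() until it returns a result, or raise TimeoutError at the deadline.

    check is a hypothetical callable that returns None while the job is still
    running and a video URL once it has finished.
    """
    deadline = clock() + timeout
    while True:
        result = check()
        if result is not None:
            return result
        if clock() >= deadline:
            raise TimeoutError(f"job not ready after {timeout} seconds")
        sleep(interval)
```

Passing sleep and clock as parameters lets the loop be tested without real waiting, and makes the interval and the overall deadline easy to tune for a slow environment.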
Accomplishments that we're proud of
We successfully created an end-to-end, multi-agent pipeline for URL-to-video generation, demonstrating that it is possible to lower video ad production costs for small businesses and entrepreneurs.
We integrated many Google Cloud products into our project, including Gemini, Veo 2, Cloud Storage, and Cloud Run, which together cover every stage from URL extraction to deployment.
Our team is global right now (summer in different countries), so we're very happy with our teamwork and collaboration!
What we learned
We learned how to use many new frameworks, including Google ADK and various Tavily, HeyGen, and Veo 2 APIs, as well as new Cloud Storage and Cloud Run techniques.
Prompting is very effective for orchestration. While we initially planned to use a deterministic method for controlling each subagent in the pipeline, we found that prompting the manager_agent allowed for more flexibility.
Communication is key! With our team spread across the globe, it was important to keep each other updated on progress and play to our strengths.
What's next for Vibe
Extended platform support: We aim to expand URL compatibility to include platforms like Amazon, Etsy, and Walmart by exploring alternative metadata extraction techniques or official APIs.
User customization: We plan to add more frontend options for users to tweak script tone, avatar style, B-roll pacing, and call-to-action preferences, making the output feel even more tailored.
Advanced post-processing: We plan to integrate features like subtitle overlays, brand logo watermarking, and auto-cropping for different platforms (e.g. vertical for TikTok, square for Instagram).
Built With
- ffmpeg
- gemini-2.0-flash
- google-adk
- google-cloud
- google-cloud-run
- google-veo-2
- heygen
- javascript
- next.js
- python
- react
- tailwind
- tavily
- typescript
- vercel