Inspiration

The internet is transitioning from text-heavy pages to short-form video, but creating high-quality video is still a manual, expensive bottleneck. Whether it’s an e-commerce brand wanting to turn a product page into a TikTok ad or a SaaS founder needing a feature spotlight, the process is the same: scrape the data, write a script, and manually animate it.

What it does

LaunchCut turns any URL into a production-ready promo video. It treats the web as a database for video content.

Omni-Scraper: Unlike specialized tools, our scraper agent can ingest any website. It extracts brand colors (CSS variables), high-resolution hero images, product descriptions, and pricing models.
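The color-extraction step can be sketched as a pass over scraped CSS for custom properties. This is an illustrative simplification, not our actual scraper agent; the function name and regex are assumptions.

```typescript
// Hypothetical sketch: pull brand colors out of scraped CSS by matching
// custom properties like `--brand-primary: #ff6600;`. Illustrative only.
function extractBrandColors(css: string): Record<string, string> {
  const colors: Record<string, string> = {};
  const re = /--([\w-]+)\s*:\s*(#[0-9a-fA-F]{3,8}|rgba?\([^)]*\))/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(css)) !== null) {
    colors[m[1]] = m[2];
  }
  return colors;
}

// extractBrandColors(":root { --brand-primary: #ff6600; --accent: rgb(10,20,30); }")
// → { "brand-primary": "#ff6600", "accent": "rgb(10,20,30)" }
```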

The Brain (Analyst & Scriptwriter): These agents distill the raw scrape into a high-conversion script. They identify the "pain point" and the "solution" based on the website's copy.

The Code Agent (OpenCode, Daytona MCP): This agent takes the script and programmatically builds the video in real time, then runs and renders it inside Daytona sandboxes. Our OpenCode agent uses the Daytona MCP to perform operations on those sandboxes.

How we built it

Video-as-Code (Remotion): We moved away from "generative video" (which is often blurry or hallucinatory) in favor of programmatic video. This allows for pixel-perfect text rendering and brand-accurate animations.
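The core idea of programmatic video is that every visual property is a deterministic function of the frame number, which is why renders are pixel-perfect and repeatable. A minimal sketch of the principle (not Remotion's actual API; the function name is illustrative):

```typescript
// Frame-driven animation: each property is a pure function of the
// current frame, so the same frame always renders identically.
// (Simplified; Remotion's `interpolate` handles far more cases.)
function fadeInOpacity(frame: number, fadeFrames: number): number {
  // Linearly ramp opacity from 0 to 1 over the first `fadeFrames` frames.
  return Math.min(1, Math.max(0, frame / fadeFrames));
}

// At 30 fps, a half-second fade spans 15 frames:
// fadeInOpacity(0, 15)  → 0
// fadeInOpacity(15, 15) → 1
```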

Agentic Infrastructure:

Daytona: We used Daytona to spin up standardized environments where our agents can execute code safely. This allows the "Director" to iterate on the video layout without human intervention.

OpenCode: The open-source coding agent we use to run headless coding sessions on Daytona.

MCP Integration: By exposing the Daytona sandbox via MCP, we gave our agents the ability to read and write files as if they were a human developer at a terminal.
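Under the hood, MCP tool invocations are JSON-RPC 2.0 requests with the `tools/call` method. A sketch of the message shape (the tool name `write_file` and its arguments are hypothetical; the actual tools are whatever the Daytona MCP server exposes):

```typescript
// Shape of the JSON-RPC request an MCP client sends to invoke a tool,
// per the Model Context Protocol's `tools/call` method.
interface McpToolCall {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): McpToolCall {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// e.g. asking the sandbox to write a Remotion composition file:
// buildToolCall(1, "write_file", { path: "src/Video.tsx", content: "..." })
```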

Challenges we ran into

Headless browsing on Daytona: We had to work around ffmpeg multithreading bottlenecks and install the headless Chrome browsing libraries Remotion needs in order to render.

The "State" of Video: Coordinating the timing between the ElevenLabs audio duration and the Remotion frame count required building a custom synchronization hook. If the script was 12.5 seconds, the React animation had to be exactly 375 frames (at 30fps).
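The conversion at the heart of that synchronization hook is simple; a sketch (function name is illustrative):

```typescript
// Convert an audio clip's duration into an exact frame count so the
// React animation ends on the same beat as the narration. Rounding
// keeps the video within half a frame of the audio.
function audioDurationToFrames(seconds: number, fps: number): number {
  return Math.round(seconds * fps);
}

// audioDurationToFrames(12.5, 30) → 375
```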

Sandbox Security: Giving an LLM the ability to write and execute code in a sandbox (Daytona) is powerful but tricky. We had to ensure the MCP layers were robust enough to handle complex file system operations without crashing the render process.

Dynamic Asset Loading: Programmatically injecting scraped images and logos into a Remotion template while maintaining a consistent "vibe" required sophisticated CSS-in-JS logic generated on the fly.
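The injection step amounts to mapping scraped brand data onto style objects for the video layers. A hedged sketch under assumed field names (the interface and function are illustrative, not our production logic):

```typescript
// Hypothetical sketch: turn scraped brand data into an inline style
// object for a video layer. Field names are illustrative.
interface BrandTheme {
  primary: string;
  background: string;
  fontFamily: string;
}

function heroStyle(theme: BrandTheme): Record<string, string> {
  return {
    color: theme.primary,
    backgroundColor: theme.background,
    fontFamily: theme.fontFamily,
    padding: "48px",
  };
}

// heroStyle({ primary: "#ff6600", background: "#fff", fontFamily: "Inter" })
// → { color: "#ff6600", backgroundColor: "#fff", fontFamily: "Inter", padding: "48px" }
```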

Accomplishments that we're proud of

Zero-Human Production: We successfully took a website and turned it into an "Explainer" video without touching a single line of code.

The Daytona/MCP Bridge: Implementing the Model Context Protocol to give our AI agent a physical "workspace" felt like giving the "Director" a real pair of hands.

Visual Consistency: Even though the video is generated by code, it doesn't look robotic. The layouts are dynamic, the typography is sharp, and the music feels intentional.

What we learned

Sandboxes are the New IDE: We learned that for agentic workflows to be reliable, they need more than just "tools"—they need full environments. Daytona provided that "room to breathe."

LLMs as Frontend Devs: Modern LLMs are surprisingly good at writing Remotion/React code because it is highly structured and declarative.

What's next for LaunchCut

Live Preview: A real-time stream of the Daytona sandbox so users can watch the AI "edit" the video live.

A/B Testing Narratives: Generating three different versions of a video with different "tones" (e.g., Professional, Hype-man, Minimalist) and letting the user choose.

Interactive Elements: Moving beyond MP4s to interactive web-based video experiences.
