Inspiration

$4.2 billion. That's how much SaaS companies waste annually re-recording product videos that are outdated within weeks.

The math is simple and brutal:

  • 26 million developers build web applications globally
  • 3.5 million SaaS products are actively maintained
  • Average product ships 47 UI updates per year
  • Each product video takes 4-8 hours to re-record
  • 92% of companies report their demo videos don't match their current product

That's not a workflow problem. That's an industry-wide crisis hiding in plain sight.

And it's accelerating—fast.

AI has compressed the build cycle from months to days. Gemini in Google AI Studio generates full-stack code in seconds. Claude Code ships features autonomously. Cursor, v0, Bolt, Lovable, Replit Agent—developers are building faster than ever before. A solo founder can ship a complete SaaS in a weekend. Teams deploy daily. Some deploy hourly.

But marketing can't keep up.

Every product video becomes a lie by the next sprint. Every demo is showing yesterday's UI. Every screenshot is from three versions ago. Companies are spending millions on marketing assets that actively mislead customers.

I watched a founder lose a $50,000 investment because his demo showed last week's UI. Eight hours of perfect recording, wasted. He lost to a competitor with an inferior product—but accurate screenshots.

This happens thousands of times per day. The same AI tools that let you build faster are making your marketing obsolete.

No solution exists. Loom can't auto-update recordings. Screen Studio can't sync with your codebase. Traditional video editing treats footage as static files—completely disconnected from the living product it represents.

Developers version-control code. Designers version-control Figma. But marketing videos? Frozen in time. Rotting in Google Drive. Lying to customers.

I asked: What if Gemini's multimodal intelligence could bridge this gap? What if videos were computed artifacts—generated directly from your codebase and automatically updated when you ship?

What if your React components weren't just UI, but the single source of truth for every product video?

That's Scenery. The world's first code-connected video platform—powered by 9 Gemini integrations.

| Before Scenery | After Scenery |
|----------------|---------------|
| Record video (8 hrs) | Connect GitHub repo |
| Ship update | Gemini generates video |
| Video is now wrong | Ship update |
| Re-record (8 hrs) | Video auto-updates |
| Repeat forever | Never re-record again |

For 3.5 million SaaS products shipping 47 updates per year, at the low end of 4 hours per re-recording, that's roughly 650 million hours of re-recording eliminated.

Gemini lets you build in days. Scenery—built on Gemini 3—ensures your marketing keeps up.

What it does

Scenery is an AI-powered video generation platform for React applications. You sign in with GitHub, connect a repo, and Scenery:

  1. Clones and analyzes your codebase to discover React components
  2. Renders live previews in a real browser (Playwright) with 4-attempt recovery
  3. Generates professional product videos using a 4-agent Gemini orchestration system
  4. Auto-updates videos when your code changes (videos are code-connected)

The core problem it solves: Product videos go stale the moment you ship an update. Traditional video editing means re-recording and re-exporting every time your UI changes. Scenery keeps videos synchronized with your actual components.


9 Gemini Integrations

Every integration uses structured output with JSON schemas—no prompt-and-pray.

| # | Integration | Model | Purpose |
|---|-------------|-------|---------|
| 1 | Component Categorization | Gemini 3 Pro | Classifies components into 27 UI categories |
| 2 | Demo Props Generation | Gemini 3 Pro | Generates realistic props from TypeScript interfaces |
| 3 | Server→Client Transform | Gemini 3 Pro | Converts async Server Components to client-safe code |
| 4 | Tailwind→Inline CSS | Gemini 2.0 Flash | Converts Tailwind classes to inline styles |
| 5 | AI Preview Fallback | Gemini 3 Pro | Generates HTML when bundling fails (5000 thinking tokens) |
| 6 | Director Agent | Gemini 3 Pro | Plans video narrative with function calling |
| 7 | Scene Planner Agent | Gemini 3 Pro | Designs positions, animations, timing (parallel) |
| 8 | Refinement Agent | Gemini 3 Pro | Scores quality 0-100, iterates until ≥90 |
| 9 | TTS Voiceover | Gemini 2.5 Flash TTS | Generates audio with 5 voice options |
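
As an illustration of what validating a structured response might look like, here is a minimal sketch for the categorization integration. The project validates with Zod; to stay dependency-free this sketch swaps in a hand-written type guard instead. The field names (`component`, `category`, `confidence`) and the five category names are assumptions for illustration, not the project's actual schema or its 27 categories.

```typescript
// Hypothetical subset of UI categories; the real list of 27 is not given in this writeup.
const CATEGORIES = ["button", "form", "navigation", "card", "modal"] as const;
type UiCategory = (typeof CATEGORIES)[number];

// Assumed response shape for the categorization call.
interface Categorization {
  component: string;
  category: UiCategory;
  confidence: number; // expected in the range 0..1
}

// Parses the raw model output and returns null instead of throwing when it does not match.
function parseCategorization(raw: string): Categorization | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON at all
  }
  if (typeof data !== "object" || data === null) return null;
  const d = data as Record<string, unknown>;
  if (typeof d.component !== "string") return null;
  if (typeof d.category !== "string") return null;
  if ((CATEGORIES as readonly string[]).indexOf(d.category) === -1) return null;
  if (typeof d.confidence !== "number" || d.confidence < 0 || d.confidence > 1) return null;
  return {
    component: d.component,
    category: d.category as UiCategory,
    confidence: d.confidence,
  };
}
```

Returning `null` rather than throwing lets the caller decide whether to re-prompt or skip the component.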

Features

GitHub Integration

  • Sign in with GitHub OAuth
  • Paste any public GitHub repo URL
  • Auto-clones and parses TypeScript/JSX components
  • Extracts props, types, and interactive elements
  • Supports Next.js 13/14/15 with App Router
  • Sync button to pull latest changes

Component Discovery Pipeline

  1. Scanner — Globs .tsx/.jsx files, ignores node_modules/tests/builds
  2. Parser — react-docgen-typescript extracts props (regex fallback for async components)
  3. Analyzer — Gemini categorizes into 27 UI categories with video showcase strategy
  4. Preview — 3-tier rendering: Playwright (95%) → SSR (70%) → AI fallback (50%)
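
The Scanner step above can be sketched as a simple path filter. This is a minimal sketch under assumptions: the exact ignore rules are not given here, so the directory names and the test-file pattern below are hypothetical stand-ins for "node_modules/tests/builds".

```typescript
// Hypothetical ignore list; the writeup only says node_modules, tests, and builds are skipped.
const IGNORED_DIRS = new Set(["node_modules", "dist", "build", ".next", "__tests__"]);

// Hypothetical convention for test files that live next to components.
const TEST_FILE = /\.(test|spec)\.(tsx|jsx)$/;

// True when a repo-relative path looks like a component source file worth parsing.
function isComponentCandidate(path: string): boolean {
  const segments = path.split("/");
  if (segments.some((segment) => IGNORED_DIRS.has(segment))) return false;
  if (TEST_FILE.test(path)) return false;
  return /\.(tsx|jsx)$/.test(path);
}

// Filters a flat list of repo paths down to component candidates.
function scan(paths: string[]): string[] {
  return paths.filter(isComponentCandidate);
}
```

A real scanner would glob the cloned repo first; the filter is kept pure here so it can be checked without touching the file system.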

Server Component Detection (263 regex patterns)

Detects and transforms Next.js Server Components across 15 categories:

| Category | Examples | Count |
|----------|----------|-------|
| Async | export default async function, await | 4 |
| Database ORMs | Prisma, Drizzle, Mongoose, Supabase, pg, mysql2 | 21 |
| Auth Libraries | next-auth, @clerk/nextjs/server, lucia | 8 |
| Node.js Built-ins | fs, path, crypto, child_process | 12 |
| Next.js Server | next/headers, cookies(), redirect() | 10 |
| File System | readFileSync, writeFile | 6 |
| And 9 more... | Email, payment, CMS, analytics, tRPC | ~200 |

When detected → Gemini transforms to client-safe code with mock data.
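
To show the general shape of pattern-based detection, here is a minimal sketch with five illustrative patterns. These regexes are assumptions written for this example and are not taken from the project's 263 patterns; only the category ideas come from the table above.

```typescript
type ServerCategory = "async" | "database" | "auth" | "node-builtin" | "next-server";

interface DetectionPattern {
  category: ServerCategory;
  regex: RegExp;
}

// Illustrative patterns only. None use the global flag, so .test() carries no state between calls.
const PATTERNS: DetectionPattern[] = [
  { category: "async", regex: /export\s+default\s+async\s+function/ },
  { category: "database", regex: /from\s+['"]@prisma\/client['"]/ },
  { category: "auth", regex: /from\s+['"]@clerk\/nextjs\/server['"]/ },
  { category: "node-builtin", regex: /from\s+['"](?:node:)?(?:fs|path|crypto|child_process)['"]/ },
  { category: "next-server", regex: /from\s+['"]next\/headers['"]/ },
];

// Returns the distinct categories whose patterns match, in pattern order.
function detectServerFeatures(source: string): ServerCategory[] {
  const hits = new Set<ServerCategory>();
  for (const { category, regex } of PATTERNS) {
    if (regex.test(source)) hits.add(category);
  }
  return Array.from(hits);
}

// A component needs the client-safe transform when any server-only feature is present.
function needsTransform(source: string): boolean {
  return detectServerFeatures(source).length > 0;
}
```

Regex detection like this is cheap enough to run on every file, so the Gemini transform is only invoked for components that actually match.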

Video Editor

Track Types: Text, Component, Video, Audio, Image, Cursor, Shape, Particles, Gradient, Film Grain, Vignette, Color Grade, Blob

30+ Animation Presets:

  • Entrance: fade-in, slide-in-*, zoom-in, bounce, elastic, spring-pop, blur-in, flip-in, rotate-in
  • Exit: fade-out, zoom-out, blur-out, slide-out-*
  • Emphasis: pulse, shake, wiggle, heartbeat, jello, glow
  • Motion: float, drift-right, ken-burns-zoom
  • Filter: color-pop, flash, hue-shift, cinematic-focus

Spring Physics (5 presets):

| Preset | Damping | Stiffness | Mass | Use Case |
|--------|---------|-----------|------|----------|
| smooth | 200 | 100 | 1 | Professional |
| snappy | 200 | 200 | 0.5 | Quick, responsive |
| heavy | 200 | 80 | 5 | Slow, dramatic |
| bouncy | 100 | 150 | 1 | Playful |
| gentle | 300 | 60 | 2 | Soft, subtle |

6 Particle Effects: confetti, sparks, snow, bubbles, stars, dust

6 Cursor Interactions: click, hover, type, focus, select, check

Text Effects: Letter-by-letter animation, gradient fill, glow, glass backgrounds

Device Frames: phone, laptop, full display modes

Voiceover

  • Gemini 2.5 Flash TTS with responseModalities: ['AUDIO']
  • 5 prebuilt voices: Kore, Charon, Fenrir, Aoede, Puck
  • 24kHz WAV output synced to timeline
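
For readers curious how raw audio becomes a 24kHz WAV file, here is a minimal sketch that wraps PCM bytes in a standard 44-byte RIFF header. It assumes the TTS response has already been decoded to 16-bit mono PCM; that sample format, the function names, and the 30 fps timeline are assumptions made for this example.

```typescript
// Writes an ASCII tag such as "RIFF" into the header at the given byte offset.
function writeAscii(view: DataView, offset: number, text: string): void {
  for (let i = 0; i < text.length; i++) {
    view.setUint8(offset + i, text.charCodeAt(i));
  }
}

// Prepends a canonical PCM WAV header to raw samples (assumed 16-bit mono at 24kHz by default).
function pcmToWav(pcm: Uint8Array, sampleRate = 24000, channels = 1, bitsPerSample = 16): Uint8Array {
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame
  const byteRate = sampleRate * blockAlign;
  const wav = new Uint8Array(44 + pcm.length);
  const view = new DataView(wav.buffer);
  writeAscii(view, 0, "RIFF");
  view.setUint32(4, 36 + pcm.length, true); // remaining file size after this field
  writeAscii(view, 8, "WAVE");
  writeAscii(view, 12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeAscii(view, 36, "data");
  view.setUint32(40, pcm.length, true);
  wav.set(pcm, 44);
  return wav;
}

// Converts a PCM byte length to a timeline length in frames, rounding up so audio is never cut off.
function audioDurationInFrames(pcmBytes: number, fps = 30, sampleRate = 24000, channels = 1, bitsPerSample = 16): number {
  const seconds = pcmBytes / (sampleRate * channels * (bitsPerSample / 8));
  return Math.ceil(seconds * fps);
}
```

Under these assumptions, one second of audio is 48,000 bytes of PCM, which maps to 30 timeline frames.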

Export

  • Remotion Lambda (AWS) for serverless rendering
  • MP4, GIF, WebM output formats
  • Parallelized rendering (60s video → ~30s render)

How we built it

4-Agent Gemini Orchestration System

User: "Create a product video for our auth components"
                    │
                    ▼
┌─────────────────────────────────────────────────────┐
│  DIRECTOR AGENT (Gemini 3 Pro)                       │
│  Tool: create_video_plan                             │
│  • Plans narrative arc (Hook→Setup→Showcase→CTA)    │
│  • Frame budget: 15% hook, 15% setup,               │
│    55% showcase, 15% CTA                            │
│  • Outputs INTENTS ("dramatic", "professional")     │
└─────────────────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────────┐
│  SCENE PLANNER AGENT (parallel per scene)            │
│  Tool: create_detailed_scene                         │
│  • Translates intents → spring animations            │
│  • Element positions (0-1 normalized)               │
│  • Keyframes (RELATIVE to element start)            │
│  • Cursor paths for tutorials                        │
│  • Narration scripts                                 │
└─────────────────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────────┐
│  ASSEMBLY AGENT (deterministic, no LLM)              │
│  • Fixes keyframe timing mistakes (rescales >90)    │
│  • Normalizes keyframe format                        │
│  • Organizes tracks by z-order                       │
│  • Validates composition                             │
└─────────────────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────────┐
│  REFINEMENT AGENT (Gemini 3 Pro)                     │
│  Tool: analyze_composition                           │
│                                                      │
│  Scoring Weights (0-100):                           │
│  • Visual Composition: 30%                          │
│  • Timing: 25%                                      │
│  • Narrative Flow: 25%                              │
│  • Animation Quality: 15%                           │
│  • Accessibility: 5%                                │
│                                                      │
│  Score < 40 → Regenerate from Director              │
│  Score 40-89 → Apply fixes, re-evaluate             │
│  Score ≥ 90 → Ship it                               │
│                                                      │
│  Max iterations: 5 (picks best version)             │
└─────────────────────────────────────────────────────┘
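
The Director's frame budget in the diagram above can be turned into concrete frame counts with a little integer arithmetic. This is a minimal sketch; the function name and the choice to hand rounding leftovers to the showcase section are assumptions.

```typescript
// Budget percentages from the Director Agent box: 15% hook, 15% setup, 55% showcase, 15% CTA.
const BUDGET_PERCENT = { hook: 15, setup: 15, showcase: 55, cta: 15 };
type Section = keyof typeof BUDGET_PERCENT;

// Splits a total frame count across sections using integer math to avoid float drift.
function allocateFrames(totalFrames: number): Record<Section, number> {
  const frames: Record<Section, number> = {
    hook: Math.floor((totalFrames * BUDGET_PERCENT.hook) / 100),
    setup: Math.floor((totalFrames * BUDGET_PERCENT.setup) / 100),
    showcase: Math.floor((totalFrames * BUDGET_PERCENT.showcase) / 100),
    cta: Math.floor((totalFrames * BUDGET_PERCENT.cta) / 100),
  };
  // Assumption: frames lost to rounding go to the showcase, the longest section.
  const used = frames.hook + frames.setup + frames.showcase + frames.cta;
  frames.showcase += totalFrames - used;
  return frames;
}
```

For a 30-second video at 30 fps (900 frames) this yields 135, 135, 495, and 135 frames.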

Component Preview Pipeline (3-tier fallback with 4-attempt recovery)

TIER 1: Playwright Browser (95% accuracy)
  • Transform to pure React (Gemini removes external deps)
  • Bundle with esbuild (50+ library mocks)
  • Render in real Chromium on Fly.io worker
  • 4-attempt recovery with progressive simplification:
    - Attempt 1: fix-props (add missing definitions)
    - Attempt 2: simplify (flat structures, empty arrays)
    - Attempt 3: minimal (aggressive mocking)
    - Attempt 4: placeholder div
    - Attempt 5+: skip component
        │
        │ Fails?
        ▼
TIER 2: Server-Side Rendering (70% accuracy)
  • renderToStaticMarkup via esbuild
  • 5-second timeout
        │
        │ Fails?
        ▼
TIER 3: AI-Only Generation (50% accuracy)
  • Gemini generates HTML from source code
  • thinkingBudget: 5000 tokens for reasoning
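
The tier chain above can be sketched as an ordered list of renderers, with the recovery modes retried inside the first tier. This is one reading of the diagram rather than the project's code: the interfaces, the function names, and the convention that a renderer throws on failure are all assumptions, and the renderers are kept synchronous for brevity.

```typescript
// Recovery modes in the order listed for Tier 1.
const RECOVERY_MODES = ["fix-props", "simplify", "minimal", "placeholder"] as const;
type RecoveryMode = (typeof RECOVERY_MODES)[number];

// A tier renders component source to HTML and throws when it cannot.
interface Tier {
  name: string;
  render: (source: string) => string;
}

interface PreviewResult {
  tier: string;
  html: string;
}

// Tries each recovery mode in turn and rethrows the last error if all of them fail.
function renderWithRecovery(source: string, renderOnce: (source: string, mode: RecoveryMode) => string): string {
  let lastError: unknown = new Error("no recovery modes configured");
  for (const mode of RECOVERY_MODES) {
    try {
      return renderOnce(source, mode);
    } catch (err) {
      lastError = err; // remember the failure and try the next, simpler mode
    }
  }
  throw lastError;
}

// Walks the tiers in order and returns the first successful render, or null to skip the component.
function renderWithFallback(source: string, tiers: Tier[]): PreviewResult | null {
  for (const tier of tiers) {
    try {
      return { tier: tier.name, html: tier.render(source) };
    } catch {
      // this tier failed, fall through to the next one
    }
  }
  return null;
}
```

Keeping the tier list as data makes the order explicit and lets each renderer be tested on its own.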

Tech Stack

  • Frontend: Next.js 15, React 19, TypeScript
  • AI: Gemini 3 Pro (agents), Gemini 2.0 Flash (quick ops), Gemini 2.5 Flash TTS
  • Video: Remotion 4 + AWS Lambda
  • Component Rendering: Playwright worker on Fly.io (separate app)
  • Bundling: esbuild with 50+ virtual module mocks
  • Database: Supabase (PostgreSQL)
  • Auth: Supabase Auth with GitHub OAuth
  • Hosting: Fly.io (2 apps: main + Playwright worker, auto-scale)
  • State: Zustand + Zundo (undo/redo)
  • Validation: Zod schemas for all AI outputs

Gemini Features Used

| Feature | Implementation |
|---------|----------------|
| Structured Output | JSON schemas with Zod validation on ALL 9 integrations |
| Function Calling | create_video_plan, create_detailed_scene, analyze_composition |
| Thinking Mode | Preview generation fallback (5000 token budget) |
| TTS | responseModalities: ['AUDIO'] with 5 voice options |
| Streaming | Real-time chat responses via Server-Sent Events |
| Long Context | Full source code analysis (900s timeout) |
| Gemini 3 Pro | All 4 agents, Server→Client transformation, categorization |
| Gemini 2.0 Flash | Tailwind→CSS conversion (speed optimized) |
| Gemini 2.5 Flash TTS | Voiceover generation |

Challenges we ran into

Server Component Detection: Next.js 13+ Server Components crash in browsers. We built detection with 263 regex patterns across 15 categories (Prisma, Drizzle, NextAuth, Clerk, Node builtins, etc.), then use Gemini to transform to client-safe code with mock data.

Multi-Agent Coordination: Director outputs intents ("dramatic entrance"), Scene Planner translates to specifics (spring configs, exact keyframes). Assembly is pure TypeScript to prevent AI error cascades.

Relative vs Absolute Keyframes: AI kept outputting absolute frames (frame 300 for a fade-in = 10 seconds!). We enforce relative keyframes where frame 0 = when THIS element appears, not video start. Assembly agent auto-fixes violations.
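
To make the relative-keyframe rule concrete, here is a minimal sketch of a repair step and the conversion to absolute timeline frames. The proportional rescale is a hypothetical repair strategy chosen for this example; the types and function names are also assumptions.

```typescript
interface Keyframe {
  frame: number; // relative to the element's own start, so 0 is when the element appears
  opacity: number;
}

interface TimelineElement {
  startFrame: number; // absolute position on the video timeline
  durationInFrames: number;
  keyframes: Keyframe[];
}

// Hypothetical repair: if any keyframe lands past the element's duration,
// rescale all keyframes proportionally so the last one ends with the element.
function normalizeKeyframes(el: TimelineElement): Keyframe[] {
  const maxFrame = Math.max(...el.keyframes.map((k) => k.frame));
  if (maxFrame <= el.durationInFrames) return el.keyframes;
  return el.keyframes.map((k) => ({
    ...k,
    frame: Math.round((k.frame * el.durationInFrames) / maxFrame),
  }));
}

// Converts repaired relative keyframes to absolute timeline frames for rendering.
function toAbsolute(el: TimelineElement): Keyframe[] {
  return normalizeKeyframes(el).map((k) => ({ ...k, frame: el.startFrame + k.frame }));
}
```

With an element that starts at frame 270 and lasts 60 frames, a stray keyframe at frame 300 is pulled back to relative frame 60, which is absolute frame 330.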

Self-Correction Without Infinite Loops: Weighted scoring with hard thresholds. Score < 40 = regenerate from scratch. Score 40-89 = patch. Score ≥90 = ship. Max 5 iterations, then pick the best version seen.
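
The scoring and loop logic described here can be sketched as follows. The weights and thresholds come from the Refinement Agent description; the function names and the callback-based loop are assumptions, and in the real system the evaluate, regenerate, and patch steps would be Gemini calls rather than plain functions.

```typescript
interface Scores {
  visual: number; // each sub-score is on a 0-100 scale
  timing: number;
  narrative: number;
  animation: number;
  accessibility: number;
}

// Weights in percent, as listed for the Refinement Agent.
const WEIGHTS: Scores = { visual: 30, timing: 25, narrative: 25, animation: 15, accessibility: 5 };

function overallScore(s: Scores): number {
  const weighted =
    s.visual * WEIGHTS.visual +
    s.timing * WEIGHTS.timing +
    s.narrative * WEIGHTS.narrative +
    s.animation * WEIGHTS.animation +
    s.accessibility * WEIGHTS.accessibility;
  return weighted / 100;
}

type Action = "regenerate" | "patch" | "ship";

// Hard thresholds: below 40 start over, 40-89 patch, 90 and above ship.
function decide(score: number): Action {
  if (score >= 90) return "ship";
  if (score < 40) return "regenerate";
  return "patch";
}

// Bounded refinement loop that always returns the best candidate seen so far.
function refine<T>(
  initial: T,
  evaluate: (candidate: T) => number,
  regenerate: () => T,
  patch: (candidate: T) => T,
  maxIterations = 5,
): { best: T; score: number; iterations: number } {
  let current = initial;
  let best = initial;
  let bestScore = -Infinity;
  for (let i = 1; i <= maxIterations; i++) {
    const score = evaluate(current);
    if (score > bestScore) {
      best = current;
      bestScore = score;
    }
    const action = decide(score);
    if (action === "ship" || i === maxIterations) {
      return { best, score: bestScore, iterations: i };
    }
    current = action === "regenerate" ? regenerate() : patch(current);
  }
  return { best, score: bestScore, iterations: 0 }; // only reached when maxIterations < 1
}
```

Tracking the best candidate separately from the current one means a patch that makes things worse never replaces a better earlier version.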

Preview Verification: Playwright sometimes captures loading states or skeletons. Gemini verifies whether the HTML actually represents the component. If it does not, Scenery falls back to AI-only generation.

Rate Limits: Added fail-fast detection for 429/quota errors—no retries on rate limits, clear user messaging.
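
A fail-fast retry wrapper along these lines might look like the sketch below. The error shape (a numeric `status` field or a message mentioning quota) is an assumption about what the API client surfaces, and the function names and user-facing message are hypothetical.

```typescript
// Assumed error shapes: an HTTP-style `status` of 429, or a message that mentions quota or rate limits.
function isRateLimitError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as { status?: unknown; message?: unknown };
  if (e.status === 429) return true;
  return typeof e.message === "string" && /quota|rate limit|RESOURCE_EXHAUSTED/i.test(e.message);
}

// Retries transient failures, but surfaces rate limits immediately with a clear message.
async function withRetry<T>(call: () => Promise<T>, maxAttempts = 3): Promise<T> {
  let lastError: unknown = new Error("withRetry needs at least one attempt");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await call();
    } catch (err) {
      if (isRateLimitError(err)) {
        // Retrying would only spend more quota, so stop here.
        throw new Error("Gemini rate limit reached. Please try again later.");
      }
      lastError = err;
    }
  }
  throw lastError;
}
```

Classifying the error in a separate pure function keeps the retry policy easy to test without making network calls.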


Accomplishments we're proud of

  • 9 distinct Gemini integrations with structured output and function calling
  • 4-agent orchestration with self-correcting refinement loop (score ≥90 to ship)
  • 263 Server Component patterns making Scenery work with any Next.js codebase
  • 3-tier preview system with 4-attempt progressive recovery per component
  • 30+ animation presets with 5 spring physics configurations
  • Code-connected videos that auto-update when repos change

What we learned

Structured output > free-form prompts. JSON schemas with Zod validation = 100% parse success rate.

Function calling > prompts for agents. Explicit tools (create_video_plan) enforce exact output shapes.

Intents > specifics for planning. Director says "dramatic", Scene Planner decides "spring-bounce with bouncy preset".
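
The split between intents and specifics can be illustrated with a small lookup. The spring values are the five presets listed under Video Editor; the intent-to-preset mapping is a hypothetical example, since in Scenery that translation is made by the Scene Planner agent rather than a fixed table.

```typescript
interface SpringConfig {
  damping: number;
  stiffness: number;
  mass: number;
}

type PresetName = "smooth" | "snappy" | "heavy" | "bouncy" | "gentle";

// Values copied from the Spring Physics preset table.
const SPRING_PRESETS: Record<PresetName, SpringConfig> = {
  smooth: { damping: 200, stiffness: 100, mass: 1 },
  snappy: { damping: 200, stiffness: 200, mass: 0.5 },
  heavy: { damping: 200, stiffness: 80, mass: 5 },
  bouncy: { damping: 100, stiffness: 150, mass: 1 },
  gentle: { damping: 300, stiffness: 60, mass: 2 },
};

// Hypothetical mapping from Director intents to presets, for illustration only.
const INTENT_TO_PRESET: { [intent: string]: PresetName | undefined } = {
  professional: "smooth",
  dramatic: "bouncy",
  playful: "bouncy",
  quick: "snappy",
  subtle: "gentle",
};

// Resolves an intent to a concrete spring config, defaulting to "smooth" for unknown intents.
function resolveSpring(intent: string): SpringConfig {
  const preset = INTENT_TO_PRESET[intent.toLowerCase()] ?? "smooth";
  return SPRING_PRESETS[preset];
}
```

Because the Director only emits intent words, a change to a preset's numbers never requires re-planning the narrative.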

Deterministic steps prevent cascades. Assembly Agent uses zero LLM—just TypeScript transforms and validation.

Verification catches bad renders. AI checking AI output catches loading states, empty renders, skeletons.

Fail fast on rate limits. Detect 429/quota errors immediately, don't waste retries.


What's next for Scenery

  • Vue/Svelte support (parser abstraction)
  • Template marketplace (pre-built video styles)
  • CI/CD integration for auto-generated videos on deploy
  • Collaborative editing (multiplayer timeline)
  • Version history for compositions

Third-Party Integrations

Remotion

Built With

  • aws-lambda
  • esbuild
  • fly.io
  • gemini-2.5-flash-tts
  • gemini-3-flash
  • gemini-3-pro
  • next.js-15
  • playwright
  • react-19
  • remotion
  • supabase
  • tailwind
  • typescript
  • zod
  • zustand