Inspiration

Marc Andreessen and Elon Musk predicted on Friday that the "future" would be "real-time generated video specific to the user". So we asked ourselves: why wait for the future when we can build it now?

Modern ads are fundamentally built on stealing attention. But users have adapted: we skip, we block, we ignore. Billions are spent on optimizing targeting, yet the core format remains flawed. Even perfectly targeted ads fail because they interrupt the experience.

That's why product placement exists: it doesn't feel like advertising. But today, it's static, manual, and impossible to scale or personalize.

That led us to a new question: what if ads weren't shown around content, targeted only at the viewers who already align with it, but generated inside the content itself, uniquely for each person, so that every viewer aligns with it?

What it does

Splyce is a generative media platform that seamlessly embeds brand moments directly into video content, transforming traditional ads into natural parts of the story. By analyzing scenes, dialogue, and visual context, it identifies the perfect moments to insert products as in-world elements, whether through subtle dialogue, props, or environmental details, so they feel like authentic character choices rather than interruptions. The result is a new form of advertising that blends invisibly into films, shows, and creator content, preserving immersion while unlocking high-impact monetization.

How we built it

We built Splyce as a JavaScript-based pipeline that processes video clips end-to-end, from scene understanding to final edited output. On the backend, we used the Gemini API to analyze frames and dialogue, extract context (objects, tone, timing), and determine where a product could be naturally integrated. We then generated context-aware insertions (both visual and textual) and used lightweight video processing to splice those into the original clip while preserving continuity. For dialogue-level integrations, we used ElevenLabs to generate voice lines that match the original speaker’s tone and pacing. The entire system is orchestrated as a modular pipeline, allowing us to quickly iterate and produce seamless, story-native ad edits in just a few seconds.
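A minimal sketch of how such a modular pipeline could be wired in JavaScript. The stage composition pattern, the data shapes, and the scene-scoring heuristic are all invented for illustration; the actual calls to Gemini, ElevenLabs, and the video splicer are omitted and would slot in as individual stages:

```javascript
// Hypothetical sketch: each stage is an async function (context) => context,
// composed left-to-right. Stage names and shapes are assumptions, not the
// actual Splyce implementation.
const pipeline = (...stages) => (input) =>
  stages.reduce((p, stage) => p.then(stage), Promise.resolve(input));

// Pure helper illustrating the "where does a product fit" step:
// rank scenes by overlap between detected objects and product keywords,
// with a bonus when the scene's tone matches the product's tone.
function selectInsertionPoints(scenes, product, maxInsertions = 2) {
  return scenes
    .map((scene) => {
      const overlap = scene.objects.filter((obj) =>
        product.keywords.includes(obj)
      ).length;
      const toneBonus = scene.tone === product.tone ? 1 : 0;
      return { ...scene, score: overlap + toneBonus };
    })
    .filter((scene) => scene.score > 0)       // drop scenes with no fit
    .sort((a, b) => b.score - a.score)        // best candidates first
    .slice(0, maxInsertions);
}
```

In this shape, the real system's Gemini analysis, insertion generation, and ElevenLabs voice synthesis would each be one stage passed to `pipeline(...)`, which is what makes the stages easy to swap during rapid iteration:

```javascript
const scenes = [
  { start: 0, end: 4, objects: ["table", "mug"], tone: "casual" },
  { start: 4, end: 9, objects: ["car"], tone: "tense" },
];
const product = { keywords: ["mug", "kitchen"], tone: "casual" };
selectInsertionPoints(scenes, product); // → only the first scene qualifies
```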

Challenges we ran into

The biggest challenge was making the integrations feel truly invisible rather than like obvious edits. Even small mismatches in lighting, timing, or dialogue tone break immersion, so we aimed to carefully align visual and audio outputs. Working with multimodal models also meant handling inconsistencies between scene understanding and generation, especially under time constraints.

Accomplishments that we're proud of

We took a concept that’s usually talked about as “future tech” and actually made it real in a weekend. We built a full end-to-end pipeline that goes from raw video → scene understanding → personalization → context-aware ad generation → final rendered output, all in seconds. Getting Gemini and ElevenLabs to work together cohesively for multimodal understanding + generation was also a big win.

What we learned

We learned that context is everything in generative media; good integrations don’t come from just detecting objects, but rather from understanding narrative intent, character behavior, and scene timing. That pushed us to think beyond simple computer vision and treat the problem as multimodal reasoning across visuals, dialogue, and story structure.

What's next for Splyce

Next, we want to improve realism and scalability by refining video editing quality, reducing latency, and expanding personalization so each viewer sees different brand integrations in real time. We also want to build a cleaner interface for creators and studios to upload content and control how integrations are applied. Long term, we see Splyce becoming a new monetization layer for media.
