ID is a web series set in the shadowed alleys of 1495 Florence, blending historical drama with a psychological thriller narrative. The series explores the clash between the Renaissance ideals of the Medici and the puritanical fervor of Girolamo Savonarola, all while a serial killer uses mechanical automatons to weave death through the city.

Inspiration

ID was born from a collision of technological breakthrough and historical fascination.

The Technological Spark: The project began when I gained access to Leonardo AI's creator program. Initially, I experimented with Renaissance aesthetics for fun, inserting myself and friends into alchemist robes. However, the release of Kling 1.6 changed everything. For the first time, I saw AI video generation that could maintain character consistency and fluid motion. It was no longer just a "moving painting"; it was cinema.

The Historical Context: I stumbled upon the history of Girolamo Savonarola, the Dominican friar who effectively ruled Florence after the Medici. His "Bonfire of the Vanities"—where art, books, and mirrors were burned—struck a chord. It paralleled modern anxieties about technology and the future. The tension between the "Old World" (medieval fear) and the "New World" (enlightenment and discovery) felt incredibly relevant to the current AI revolution.

Psychological Depth: The title ID refers to the Freudian concept of the primal psyche—the repository of our rawest drives. I wanted to explore what happens when a society represses its nature, and how that repression manifests as violence.

How it was Built: The AI Pipeline

The production of ID was unique not just in its tools, but in its transparency. The entire creation of the pilot and episode 2 was livestreamed on YouTube, turning the production into an open-source learning experience.

  1. Pre-Production & Scripting

The script was written in Arc Studio with assistance from ChatGPT for historical fact-checking (e.g., verifying the lineage of the Pazzi family or the specific chemical compounds an alchemist might use in 1494). We utilized Midjourney to create a comprehensive mood board, defining the lighting and texture of the "Dark Renaissance."

  2. Character Consistency (The Holy Grail)

To solve the biggest problem in AI video—consistency—we used a multi-step process:

Base Photography: We took photos of real actors (friends and myself) to capture authentic facial structures.

Training: We trained Flux LoRAs (using Freepik) on these datasets. This allowed us to generate the specific characters in any lighting or angle.

Costume Design: We used Kling and Kolors for "Virtual Try-On," effectively generating period-accurate costumes onto our character models.

  3. Video Generation & Animation

The visual heavy lifting was done by a combination of state-of-the-art models:

Kling AI (v2.1): Used for the majority of character movement and complex interactions.

Runway Gen-4: Utilized for image-to-video transitions and specific atmospheric shots.

Google Veo: Employed for crowd scenes and fire effects during the bonfire sequences.

HeyGen (v4) & Live Portrait: Used for lip-syncing and transferring nuanced facial performances from reference footage to the AI avatars.

  4. Audio Synthesis

Music: The soundtrack was composed using Udio, allowing for period-specific instrumentation (lutes, choirs) mixed with cinematic tension.

Voice: ElevenLabs v3 provided the voices, utilizing their new features to add directed emotion (whispers, shouts, and solemnity) to the dialogue.

  5. Post-Production

The disparate AI elements were composited in DaVinci Resolve Studio. We applied rigorous color grading to unify the footage from different AI models. Finally, the footage was upscaled to 4K at 60fps using Topaz Video AI.
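One way to keep track of which clips need which grade is to group them by the model that generated them, so every clip from the same source gets the same correction pass before the final unifying grade. The snippet below is only a minimal sketch of that bookkeeping, not the tooling actually used on ID: the `Clip` class, the `group_by_model` helper, and the shot IDs are all hypothetical names made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    """One generated clip. Both fields are illustrative assumptions."""
    shot_id: str        # hypothetical shot naming scheme
    source_model: str   # the video model that produced the clip


def group_by_model(clips):
    """Return a dict mapping each source model to its shot IDs, in input order."""
    groups = defaultdict(list)
    for clip in clips:
        groups[clip.source_model].append(clip.shot_id)
    return dict(groups)


# Example shot list (invented IDs, model names taken from the pipeline above).
clips = [
    Clip("ep1_sh010", "Kling AI"),
    Clip("ep1_sh020", "Runway Gen-4"),
    Clip("ep1_sh030", "Kling AI"),
    Clip("ep1_sh040", "Google Veo"),
]

groups = group_by_model(clips)
for model, shot_ids in groups.items():
    print(f"{model}: {', '.join(shot_ids)}")
```

Each group can then be given a shared correction in the editor before the overall look is applied across the whole timeline.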

Challenges Faced

Creating a coherent narrative with probabilistic tools presented significant hurdles:

The "Bonfire of the Vanities" Intro: I envisioned a wooden scale model of Florence burning. Midjourney struggled to adhere to this complex prompt, constantly confusing the scale.

Solution: I used ChatGPT's DALL-E 3 (via Agent Mode) to get the composition right, as it adheres better to complex instructions. I then passed that image through Magnific AI to transfer the realistic texture and style from our mood board, and finally refined it in Photoshop.

Nudity Filters: Attempting to recreate Renaissance masterpieces like David or The Birth of Venus triggered the safety filters in commercial AI models, which blocked the generation of these classical nudes.

Solution: We had to pivot creatively, choosing to represent themes through The Annunciation and other non-nude classical references, effectively "rewriting" the visual script on the fly.

Complex Interactions: Simple actions, like a cart passing in front of a door or a hand picking up an object, are incredibly difficult for video models. We often had to generate 3-4 seconds of video just to get 1 usable second, requiring heavy editing and stabilization in post.
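To put that ratio in perspective, here is a rough back-of-the-envelope sketch of how much footage has to be generated for a given amount of usable screen time. The 3x to 4x ratio comes from our experience above; the 60-second target and the 5-second clip length are assumptions chosen purely for illustration, and the function names are hypothetical.

```python
import math


def raw_seconds_needed(usable_seconds, generated_per_usable):
    """Seconds of footage to generate for a target amount of usable footage."""
    if usable_seconds < 0 or generated_per_usable < 1:
        raise ValueError("need usable_seconds >= 0 and a ratio of at least 1")
    return usable_seconds * generated_per_usable


def clips_needed(raw_seconds, clip_length_seconds):
    """Number of fixed-length generations needed to cover raw_seconds."""
    if clip_length_seconds <= 0:
        raise ValueError("clip_length_seconds must be positive")
    return math.ceil(raw_seconds / clip_length_seconds)


# Assumed target: one minute of usable footage, generated in 5-second clips.
TARGET_USABLE_SECONDS = 60
ASSUMED_CLIP_LENGTH = 5

for ratio in (3, 4):
    raw = raw_seconds_needed(TARGET_USABLE_SECONDS, ratio)
    count = clips_needed(raw, ASSUMED_CLIP_LENGTH)
    print(f"ratio {ratio}:1 -> {raw} s generated, about {count} clips")
```

Under these assumptions, one usable minute means generating three to four minutes of raw footage, before counting any retries for failed generations.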

What I Learned

The most profound realization was the democratization of production value. I learned that we are moving from "role-playing" as filmmakers to actual filmmaking. The barrier is no longer budget or access to locations; it is imagination and curation. By documenting this journey, I hope to show that AI is not just a tool for automation, but a canvas for deep, human storytelling.
