Inspiration
We were deeply inspired by the beautiful, original song "lithium" by the renowned artist nobody likes you pat. As a dancer, choreographer, and video creator, my primary goal is always to find new ways to visualize movement and music. The inspiration for this project was to explore the intersection of human performance and artificial intelligence. We wanted to see if AI could be used not to replace the human element, but to amplify it—to act as a digital collaborator that could visually interpret the energy of the dance (choreographed and performed by myself) in a way that traditional effects cannot.
What it does
This project is a 3-minute music video for "lithium" that uses a sophisticated AI workflow to transfer the original human performance into new visual styles. It translates the exact choreography into a dynamic, surreal visual layer. The video composites AI-generated imagery (created with models like Imagen 3) that is then driven by the original motion of my dance performance, creating a seamless blend of live-action and AI.
How we built it
Our workflow was a deliberate hybrid of human performance, a multi-model AI stack, and video editing.
- Human Foundation: The project is anchored by two original, non-AI elements: the song "lithium" by nobody likes you pat (used with permission) and the original choreography performed by myself.
- AI Core (ComfyUI, running locally): AI processing was handled mainly on local compute. Our core process involved:
- Visual Composition: To create the video's unique aesthetic, we used Imagen 3. This model generated the high-quality reference images.
- Stylization: We used WAN 2.2 to stylize the original live-action performance in a ComfyUI flow. This model was crucial for transferring the desired artistic style onto our footage while preserving the complex dance movements, creating the final visual output.
- Post-Production & Compositing (Vegas Pro 21): All the raw AI outputs were brought into Vegas Pro 21. Its timeline and compositing features were used for meticulously layering, blending, and timing the shots to the music.
- Final Polish (CapCut): We used CapCut for the final stage. Its intuitive interface was perfect for adding quick, dynamic effects, and creating the punchy, final export.
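For anyone curious how the ComfyUI stage of a pipeline like this can be automated: ComfyUI exposes a local HTTP API (by default on port 8188), and a workflow exported from the GUI via "Save (API Format)" can be queued programmatically. The sketch below is a minimal illustration under those assumptions, not our exact setup; the node IDs and input names depend entirely on the specific workflow.

```python
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188"  # ComfyUI's default local server address


def load_workflow(path):
    # A workflow exported via "Save (API Format)" is a dict of
    # node-id -> {"class_type": ..., "inputs": {...}}.
    with open(path) as f:
        return json.load(f)


def set_node_input(workflow, node_id, key, value):
    # Tweak a single node input in place, e.g. swapping the reference
    # image between render passes (node_id and key are workflow-specific).
    workflow[node_id]["inputs"][key] = value
    return workflow


def queue_prompt(workflow, server=COMFY_URL):
    # Queue the workflow on the local ComfyUI server; the response
    # includes the prompt_id of the queued job.
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(
        f"{server}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

In practice we iterated inside the GUI; scripting like this only pays off once a workflow is stable and you want to batch many reference images or clips through it.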
Challenges we ran into
Our primary challenge was moving from a creative vision to a technical execution.
In ComfyUI we spent a significant amount of time modifying the workflow and references to tweak the AI's output. Faithfully capturing the fast, expressive choreography required some iteration and fine-tuning of the model's inputs.
The AI struggled with certain fast movements in the original footage. We addressed this through re-rendering and video editing to minimize visible artifacts.
After generation, we meticulously sorted through all the clips and stitched together the best renditions to create a single, seamless performance. This was a manual, artistic process of compositing and editing to ensure the AI enhanced the dance, rather than distracting from it.
Accomplishments that we're proud of
We are incredibly proud of the final synthesis. This video doesn't just look "AI-generated"; it feels like a true collaboration. We successfully integrated a complex, multi-model AI workflow (WAN 2.2, Imagen 3) with professional editing tools (Vegas Pro 21, CapCut) to create a unique visual language for the song, all driven by the human performance.
What's next
We have an active channel on YouTube and will continue to post new AI-enhanced music videos, exploring this blend of human choreography and artificial intelligence.
Built With
- capcut
- comfy