Inspiration
Approximately 2,300 images and 1,241 videos (possibly more) for 107 shots from the storyboard/approved scenes in the video, ranging from simple to very complex.
What it does
The pipeline for most shots is: photo generation -> final rendering -> upscaling -> final rendering -> video generation -> video upscaling -> editing + video upscaling (if necessary)
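As a rough illustration only, the stage order above can be written down as a simple ordered plan per shot. The stage names are copied from the pipeline description; the function name `plan_shot`, the flag `needs_extra_upscale`, and the reading of "editing + video upscaling (if necessary)" as "editing, then an optional extra upscaling pass" are assumptions made for this sketch, not part of the actual tooling.

```python
# Stage names taken from the pipeline description; the data layout is hypothetical.
BASE_STAGES = (
    "photo generation",
    "final rendering",
    "upscaling",
    "final rendering",
    "video generation",
    "video upscaling",
    "editing",
)

# Assumed reading: the last upscaling pass is applied only when a shot needs it.
OPTIONAL_LAST_STAGE = "video upscaling"


def plan_shot(needs_extra_upscale: bool = False) -> list[str]:
    """Return the ordered list of stages for one shot."""
    stages = list(BASE_STAGES)
    if needs_extra_upscale:
        stages.append(OPTIONAL_LAST_STAGE)
    return stages


if __name__ == "__main__":
    # Print the plan for a shot that needs the extra upscaling pass.
    print(" -> ".join(plan_shot(needs_extra_upscale=True)))
```

Keeping the plan as plain data makes it easy to see which shots went through the extra upscaling pass and which did not.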
Ingredients (ComfyUI, open-source, and API solutions): Gemini, SDXL, MJ, Qwen, Ideogram, Krea, Seedream, Wan, Seedance, Kling, Minimax, Reve Img, Banana, Minimax Audio, Auto-sfx, Pixverse, Sync, Multitalk, ElevenLabs, Adobe Firefly, Swapface/FaceFusion, Enhancor, Magnific, Topaz, ESRGAN, GFPGAN, and 3 LoRAs.
Editors: Premiere, DSP Motion, VSTPlugs, After Effects, Photoshop.
How we built it
Challenges we ran into
Accomplishments that we're proud of
What we learned
In brief (my opinion): I didn't like it. The wide shots are a mess (with Topaz, of course), and no neural network can properly render the strings of musical instruments, whether violins, guitars, or harps. Fine details such as tree branches, crowds of people, and very fast motion are more than Topaz can handle. We wrote to them about this issue back in February, but they haven't responded. Some shots could have been thrown out, though the client actually liked it. What I did like: the dynamic shots, the close-ups, the medium shots, and the physics, which even look fairly realistic in places.