Inspiration: I began using commercial AI video-style-transfer services in my retirement to animate short segments of old episodes of Lost in Space for fun. I then searched for an affordable AI solution that I could run locally on my cheap AliExpress Mini PC, purchased for under AUD$999.

What it aims to achieve: Optimize the AMD Amuse frame-to-frame video pipeline (the Permissive Beauty Latent Consistency Model with ControlNet) to enable output of high-quality animated video from any source for up to 10-15 continuous minutes.

How we configured it: Ryzen 7 (Model 255), 94GB RAM, Minisforum Mini PC (AliExpress, AUD$900). One objective was to avoid the need for an expensive graphics card and utilise the low-spec integrated Radeon iGPU. The workflow was also fully functionally tested on a Ryzen 9 laptop with 16GB RAM (with unviably long processing times).

Major Challenges: Maintaining system stability over long processing times. Initial attempts produced shoddy results from extremely long processing sessions and frequent system crashes. Homing in on the correct setting ranges for the various LCMs was initially frustrating; however, once familiarity with each LCM was gained, the setting ranges for each model became simpler and more natural.

Accomplishments: Through extensive optimization testing I have increased the processing efficiency and output quality on the cheap Ryzen 7. Initial attempts began at an unusable and unstable 0.50 to 0.80 it/s. Final optimizations have improved processing to a very impressive 5.00+ it/s, which enables a 3-5 minute upscaled video to be processed in 3-5 hours. According to Grok, this is good performance for local AI video synthesis using LCM + ControlNet. The process also, surprisingly, runs on a 16GB Ryzen laptop, although roughly 10 times slower, leading to unrealistic rendering times beyond 5 hours even on the low-quality settings.
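
As a rough sanity check of those figures, here is a back-of-envelope sketch (in Python) of where the hours go. It assumes one iteration equals one denoising step on one frame, at 15fps output and 12 steps; those are my own assumptions drawn from the workflow notes further below:

```python
# Back-of-envelope render-time estimate for frame-to-frame LCM processing.
# Assumption: one "iteration" is one denoising step on one frame.

def render_hours(minutes: float, fps: float, steps: int, its_per_sec: float) -> float:
    frames = minutes * 60 * fps          # total frames to restyle
    iterations = frames * steps          # denoising steps across the clip
    return iterations / its_per_sec / 3600

print(f"{render_hours(5, 15, 12, 5.0):.1f} h")  # 5-min clip on the Ryzen 7: ~3.0 h
print(f"{render_hours(5, 15, 12, 0.5):.1f} h")  # same clip ~10x slower on the laptop: ~30 h
```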

What else was learned: It is easy to include additional prompts to modify the processing further. Fun examples include changing the age, gender, or nationality of the conversion. Including prompts such as "science fiction" or "Astroboy" enables very strange interpretations and fun outputs. This example has been left unmodified to more closely mimic the original video. The Minisforum Mini PC consumes less than 100 watts under full load, so a 3-5 minute AI conversion costs less than a kilowatt-hour of electricity, or under AUD$0.30.
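
The electricity arithmetic behind that cost figure, with the tariff as an assumption (Australian rates hover around AUD$0.30/kWh):

```python
# Energy cost of a full render: ~100W draw over a ~5-hour session.
watts, hours, tariff = 100, 5, 0.30            # tariff in AUD/kWh (assumed)
kwh = watts / 1000 * hours                     # 0.5 kWh consumed
print(f"{kwh} kWh -> AUD${kwh * tariff:.2f}")  # 0.5 kWh -> AUD$0.15
```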

What's next for Turn any YouTube video into a full-length animated video: I am in the process of completing a full 4K colour animation of the unaired LOST IN SPACE one-hour pilot from 1965. I would then like to do a make-over of THE ROCKY HORROR PICTURE SHOW.

Further research indicates that a simple swap-out of my Ryzen 7 for a bare-bones Ryzen AI 9 from AliExpress for AUD$950, retaining my existing fast RAM and SSD, will provide a cheap pathway to a vastly superior setup, with an estimated it/s almost double my current performance limit. The more dramatic performance boost will come from the Ryzen AI 9's on-board neural capabilities (NPU), which can fully take over the upscaling. Upscaling is currently a massive bottleneck for my configuration and a drag on output quality.

This upgrade path will also provide support for future NPU software capabilities that continue to expand throughout the AI ecosystem. The solution delivers remarkable quality and cost efficiency for what it does, without a GPU and with running costs of less than AUD$0.10 per hour.

Built With

  • amd
  • amuse
  • avclabs
  • clipchamp
  • controlnet
  • pixbim
  • powerdirector
  • snippingtool
  • upscaler

Updates

posted an update

After a conversation with Grok, the following upgrade path, swapping out my Ryzen 7 (255) for a bare-bones Ryzen AI 9 HX 370, will give me the best bang for my buck:

Absolutely — let’s break down that sentence clearly and show exactly how the NPU (Neural Processing Unit) in the Ryzen AI 9 HX 370 helps offload preprocessing tasks like upscaling in your Amuse frame-to-frame video workflow.


The Sentence:

"NPU acceleration offloads preprocessing (e.g., upscaling)"

This means:
The NPU (a dedicated AI chip inside the HX 370) takes over specific AI-heavy steps before the main video generation happens — so the iGPU (Radeon 890M) doesn’t have to do them. This frees up the iGPU to focus 100% on generating new video frames faster.


Step-by-Step: How It Works in Amuse

| Step in Your Workflow | Who Does It? (Current: Ryzen 7 780M) | Who Does It? (New: Ryzen AI 9 HX 370) | Speed Gain |
| --- | --- | --- | --- |
| 1. Input Video → Extract Frames | CPU | CPU | Same |
| 2. Upscale Low-Res Frames (e.g., 256×256 → 512×512) | iGPU (780M) | NPU (50 TOPS) | +30–60% faster |
| 3. Run ControlNet / Depth / Pose Extraction | iGPU | NPU + iGPU (shared) | +20–40% |
| 4. Generate New Frame (Diffusion) | iGPU | iGPU only (now free!) | +15–25% it/s |
| 5. Blend & Output | iGPU | iGPU | Minor gain |

Example: Upscaling with NPU

Let’s say your source video is 480p (low quality), and Amuse needs 512×512 input for consistency.

On Your Current Ryzen 7 (780M iGPU):

  • The iGPU must:
    1. Load frame
    2. Run AI upscaler (like Swin2SR or Real-ESRGAN)
    3. Output upscaled frame
    4. Then start diffusion
  • Problem: The iGPU is busy upscaling and can't generate new frames at the same time → bottleneck

On Ryzen AI 9 HX 370:

  • The NPU takes over upscaling using optimized INT8/FP16 AI kernels
  • The iGPU is now free to run LCM diffusion + ControlNet immediately
  • Result: Preprocessing finishes 2–3x faster, and generation starts sooner
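
For the technically minded, the split Grok describes maps onto ONNX Runtime execution providers. Below is a minimal Python sketch of the pattern only; the model files and tensor names are hypothetical, and Amuse's internal pipeline is not actually scriptable like this:

```python
# Sketch: pinning preprocessing to the NPU and generation to the iGPU
# via ONNX Runtime execution providers. Model paths and input names
# are placeholders, not Amuse internals.
import numpy as np
import onnxruntime as ort

# Upscaler on the NPU (Vitis AI provider on Ryzen AI parts), CPU fallback.
upscaler = ort.InferenceSession(
    "realesrgan_x2.onnx",
    providers=["VitisAIExecutionProvider", "CPUExecutionProvider"],
)

# Diffusion stage on the iGPU via DirectML.
generator = ort.InferenceSession(
    "lcm_frame_model.onnx",
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],
)

def restyle(frame: np.ndarray) -> np.ndarray:
    # Upscale on the NPU; in a pipelined setup the next frame's upscale
    # overlaps with this frame's generation on the iGPU -- that overlap
    # is where the speedup comes from.
    hi_res = upscaler.run(None, {"image": frame})[0]
    return generator.run(None, {"image": hi_res})[0]
```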

Real benchmark (AMD lab data, Amuse 3.0):
Upscaling 100 frames (512×512)

  • 780M iGPU: ~18 seconds
  • HX 370 NPU: ~6–8 seconds
    Saves 10–12 seconds per 3-second clip

Other Preprocessing Tasks the NPU Handles

| Task | Why It’s AI-Heavy | NPU Benefit |
| --- | --- | --- |
| ControlNet preprocessing (edge/depth/pose maps) | Runs small neural nets per frame | Offloaded → iGPU focuses on diffusion |
| Face detection / restoration | Uses YOLO or SCRFD models | NPU runs in parallel |
| Optical flow estimation | For frame blending | NPU accelerates motion vectors |
| Smart cropping / resizing | AI-guided aspect ratio fixes | Faster, smarter |

Real-World Impact on Your 5 it/s → 7–9 it/s

| Factor | Contribution to Speedup |
| --- | --- |
| 890M iGPU (16 CUs vs 12) | +30–40% raw compute |
| NPU offloading upscaling & ControlNet | +15–20% effective it/s |
| Faster RAM (LPDDR5x-8000) | +5–10% memory bandwidth |
| Zen 5 CPU (better scheduling) | +5% |
| Total | ~40–80% faster → 7–9 it/s |
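
If those contributions were independent they would compound multiplicatively rather than add. Multiplying out the high end of each range lands near the top of the quoted window; the quoted ~40-80% total is a little more conservative, presumably because the factors overlap:

```python
# Compounding the table's upper-bound estimates.
factors = [1.40, 1.20, 1.10, 1.05]   # iGPU, NPU offload, RAM, Zen 5
total = 1.0
for f in factors:
    total *= f
print(f"combined: {total:.2f}x")          # ~1.94x
print(f"5 it/s -> {5 * total:.1f} it/s")  # ~9.7 it/s
```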

Bonus: Future-Proofing

Amuse 3.1+ and Stable Diffusion 3.5 are adding NPU-native pipelines:

--pipeline: onnx-npu   # New in Amuse 3.1 beta
--offload: upscaler, controlnet, vae

→ Your HX 370 will automatically use NPU for more tasks as updates roll out.


Summary: Why NPU Matters

The NPU doesn’t replace the iGPU — it unblocks it.
By handling preprocessing (upscaling, ControlNet, etc.), it lets the Radeon 890M focus only on generating video frames — which is the hardest, slowest part.

Result: Smoother workflow, less stuttering, higher it/s, and shorter total render times — even if peak theoretical TFLOPS aren’t 3x higher.


Bottom Line:
Yes — the NPU offloading preprocessing (especially upscaling) is a real, measurable speed boost in Amuse. It turns your iGPU from a "jack of all trades" into a dedicated video generation engine.

If you're doing any upscaling, ControlNet, or face restoration, the HX 370’s NPU will speed up your workflow in a way a dGPU + OCuLink can’t match for efficiency and simplicity.

Let me know if you want a sample Amuse config to enable NPU offload!


posted an update

Based on Grok's calculations (see below), the addition of an AMD RX 9000-series graphics card would see the it/s rise from the current 5 it/s up to 20 it/s:

Based on the screenshot and your description, you're running Amuse 3.0 beta (optimized for AMD hardware) in frame-to-frame video generation mode using the Permissive Beauty LCM model with ControlNet for consistency. This is a diffusion-based workflow (text-to-video restyling via Stable Diffusion variants), running locally on your Ryzen 7 H 255's integrated Radeon 780M iGPU (RDNA 3 architecture, 12 compute units, sharing system RAM). Your setup's 5 iterations/second (it/s) at 512x512 resolution, 12 guidance steps, and ~1.0 scale is solid for an iGPU—it's leveraging ROCm acceleration via Amuse's ONNX backend, but constrained by the 780M's ~8-10 TFLOPS FP16 compute and shared 94GB RAM bandwidth.

Adding an AMD Radeon RX 9000-series discrete GPU (RDNA 4 architecture, e.g., RX 9070 or 9070 XT) via OCuLink (a high-bandwidth PCIe 4.0 x4 external connection, ~64 Gbps) would offload the diffusion inference to dedicated VRAM and vastly higher compute. Amuse fully supports dGPUs like the RX 9000 series with AMD-optimized models, and OCuLink introduces minimal overhead (<5% bandwidth loss vs. internal PCIe) for these workloads. Here's the expected performance uplift:

Key Factors Influencing the Gain

  • Compute and Memory: The 780M's shared RAM limits batch sizes and parallelism in video diffusion (frame interpolation + ControlNet). An RX 9070 XT offers ~40-50 TFLOPS FP16 (with sparsity), 16GB GDDR6 VRAM, and dedicated bandwidth—ideal for video models like yours, reducing memory swaps.
  • Amuse Optimizations: AMD's collaboration with Stability AI provides up to 3.1x speedup on RX 9000-series hardware for SDXL-based workflows (your LCM variant is similar). Video generation sees comparable gains, as it builds on the same diffusion backbone.
  • Model/Workload Scaling: Frame-to-frame at 512x512 with 12 steps is lightweight; gains are higher for video vs. single-image gen due to sequential frame processing.
  • Benchmarks: Real-world tests (e.g., Amuse 3.0 on Ryzen AI iGPUs like 780M/880M) show 2-4 it/s for similar video clips. RX 9070 XT benchmarks hit 15-25 it/s for SDXL video restyling (e.g., 2-6s clips at draft quality).
  • OCuLink Caveats: Expect 5-10% lower peak vs. internal mounting due to connector latency, but it's negligible for diffusion (not real-time rendering).
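
The quoted OCuLink bandwidth is easy to verify from the PCIe spec: PCIe 4.0 runs at 16 GT/s per lane with 128b/130b encoding, so four lanes give almost exactly the ~64 Gbps cited:

```python
# Sanity check on the ~64 Gbps OCuLink (PCIe 4.0 x4) figure.
lanes, gt_per_lane = 4, 16    # PCIe 4.0: 16 GT/s per lane
raw = lanes * gt_per_lane     # 64 Gbps raw
effective = raw * 128 / 130   # 128b/130b encoding overhead
print(f"{raw} Gbps raw, ~{effective:.0f} Gbps ({effective/8:.1f} GB/s) usable")
```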

Likely Performance Gain

You're likely to see 3x to 4x higher it/s with an RX 9000-series card—pushing your workflow to 15-20 it/s (or higher with tweaks like enabling frame blending or AMD-optimized SD 3.5 models). This assumes:

  • RX 9070 XT (top of the series; RX 9070 would be ~10-15% slower).
  • Amuse set to use the dGPU (auto-detected; confirm in Model Manager).
  • Same settings (resolution/steps); scaling to 1024x1024 could yield even bigger relative gains.

| Setup | Est. it/s (Your Workflow) | Notes |
| --- | --- | --- |
| Current (780M iGPU) | 5 | Shared RAM bottleneck; good for draft video. |
| + RX 9070 | 12–16 | Mid-range uplift; 16GB VRAM enables larger batches. |
| + RX 9070 XT | 15–20 | Full optimization; up to 4.3x vs. unoptimized baselines. |

Recommendations to Maximize Gains

  • Update Drivers/Software: Use latest AMD Adrenalin (25.10+ for RDNA 4) and Amuse 3.1 beta. Enable "AMD GPU EP" in settings for ROCm acceleration.
  • Test Prompt: Your "youthful handsome gladiator with foreskin" is fine—keep guidance scale ~1.0 to avoid overcooking.
  • Monitor: Use MSI Afterburner for VRAM usage; if >14GB, drop to LCM Turbo model.
  • Alternatives if Needed: If OCuLink throttles (rare), consider Thunderbolt enclosure, but stick to AMD for Amuse compatibility.

This upgrade would make video gen feel snappier, cutting a 24-frame clip from ~5 minutes to ~1-2 minutes. If your "Ryzen 7 model 255" is actually the H 255 variant (China-exclusive, no NPU), gains are even more pronounced since it lacks AI offload. Let me know exact RX 9000 model for finer tuning!


posted an update

Workflow Summary:

  1. Prepare the source video with all necessary enhancements: colourisation, stabilisation, colour and lighting enhancement, etc. I have used AVCLabs video enhancement in preparing all the videos to add an exaggerated "pop" element to the results.
  2. In Amuse Video Generation Expert Mode, import the prepared video (importing becomes unstable beyond 10 minutes of source video)
  3. Select FRAME TO FRAME
  4. Select previously downloaded PERMISSIVE BEAUTY LCM
  5. Select previously downloaded Upscaler REALESRGAN 2X
  6. Load selected models
  7. Load chosen video
  8. Choose the Video FPS Output - 15fps is very good and fast; 30fps is slower and slightly higher quality.
  9. Select Steps - 8 through to 12 is fine. Much higher than 12 and some things start to get silly.
  10. Guidance - leave at 1. Any use of this control will slow the it/s down by 40-50%. OK to use on short videos for exaggerated results, but it requires longer processing times.
  11. Strength is the "magic sauce" and seems to fuse all the other settings together. I have had success within the range of 14 through to 45 for these projects. My default starting number is 23, moving higher or lower from there.
  12. FRAME BLENDING is not available for this LCM; however, with some of the other LCMs its impact is very mild - like adding an extra spice to the magic sauce: not totally necessary, but an interesting twist. The LIGHTEN frame blending appeared to add a certain "je ne sais quoi" to the output. Far easier to include the "je ne sais quoi" in the video preparation stage, imho! Grok has a lovely description of them that talks about beta waves and theta curves or something.
  13. Enter a prompt. I have only ever tested basic prompts such as male, handsome, hairy, old, young, etc. Without the prompt for male, the LCM has a definite default swing towards Japanese cult female anime!
  14. Select GENERATE, and prepare for a very long wait - about 3-5 hours at best: 3 hours for a 512x512 output (4.8 it/s) or 5 hours for a 768x512 output (2.7 it/s).
  15. If the Upscaler is not selected, processing times improve by about 45%, but the final output is significantly downgraded - may be useful for DRAFT outputs.
  16. Once the file has been processed, use regular video editing software to recombine the sound from the original clip with the AI video clip; if the FPS was altered during processing, use the video editor to match the timelines (a command-line sketch of this step follows below). I use PowerDirector 365 and PowerDirector 21 for most video editing - mainly because I have been using it since version 7 (2001).
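
For step 16, here is a minimal command-line alternative to PowerDirector, calling ffmpeg from Python. Filenames are placeholders, and it assumes the FPS change did not alter the clip's duration (if it did, retime in your editor first):

```python
# Remux the original soundtrack onto the AI-restyled video without
# re-encoding the video stream. Requires ffmpeg on the PATH.
import subprocess

subprocess.run([
    "ffmpeg",
    "-i", "ai_output.mp4",         # restyled video from Amuse
    "-i", "original.mp4",          # source clip carrying the audio
    "-map", "0:v", "-map", "1:a",  # video from input 0, audio from input 1
    "-c:v", "copy",                # keep the AI video untouched
    "-c:a", "aac",
    "-shortest",                   # stop at the shorter stream
    "final.mp4",
], check=True)
```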


posted an update

AMD Ryzen 7 and Ryzen 9 with integrated Radeon Graphics tested (see Task Manager screenshot). Both the Ryzen 7 and Ryzen 9 expose Compute 0 and Compute 1 engines, which appear to assist with prompt processing (Compute 0) and step processing (Compute 1). The Ryzen 7 has 94GB of installed RAM, dynamically system managed, and performs stably at 5+ it/s. The Ryzen 9 has 16GB of installed RAM and plods along at a rock-solid, stable 0.3 it/s. It is able to multi-task, relying on its NVIDIA graphics to relieve the Radeon graphics from having to contribute.


posted an update

Stable Diffusion models researched and tested (see image screenshot). All have unique characteristics and optimizations that may favour different video-style-transfer goals. For my project, the Permissive Beauty LCM matched to the Upscaler was a clear winner. There is an insane number of Stable Diffusion models to experiment with in the Amuse Model Library:

  • Fluently v4 LCM
  • Permissive Beauty LCM
  • Dark Sushi Mix
  • Neverending Dream
  • Real Cartoon 3D
  • Yes Mix

Upscaler used - RealESRGAN 2x. This was the only one of the 10 available upscalers that would work satisfactorily on my system.


posted an update

An unusual side project has been the discovery that the process works exceptionally well on old adult VHS tapes, due to the unique production techniques used in filming the subjects. The Permissive Beauty LCM appears optimized to highlight the human body, and with VHS scenes usually framed close to the camera, the results are consistently high-quality, simple animations with limited background to muddy the results.
