The Scale: A Surrealist Music Video About America's Weight Loss Industrial Complex

Inspiration

I've struggled with my weight my entire life. While I don't tie my self-worth to how I look, it does greatly impact my life because it impacts my health. In this last year of my 30s, I'm digging into why this has been such a lifelong challenge and actually trying to fix my health.

Why can I eat dairy in Europe without any issues, but in America I have immediate negative reactions? Why are roughly 73% of American adults overweight or obese if this is just a matter of personal willpower? Why do the same corporations that manufacture hyperpalatable, addictive processed foods also own the diet programs and pharmaceutical solutions?

This isn't a bug. It's a feature. It's a closed-loop profit system:

  1. Government subsidizes addictive ingredients (corn, soy, sugar)
  2. Food scientists engineer "bliss point" products designed to override satiety
  3. Marketing normalizes overconsumption
  4. Medical establishment pathologizes the predictable result (obesity)
  5. Pharmaceutical companies sell expensive treatments
  6. We blame ourselves and buy more solutions
  7. Repeat

The weight loss industry is worth $4.5 trillion globally. These companies don't profit from our success. They profit from our failure and our shame.

I wanted to create something that captured both the deeply personal pain of this struggle and the systemic forces that create it. A music video felt like the perfect medium: emotional enough to resonate with anyone who's felt trapped in this cycle, but sharp enough to name what's really happening.

What I Learned

About the Issue:

  • The same parent companies often own both junk food brands and weight loss pharmaceutical divisions
  • Food scientists literally engineer addiction by tuning salt, sugar, and fat ratios to what the industry calls the "bliss point"
  • Government agricultural subsidies overwhelmingly favor ingredients that contribute to obesity
  • The revolving door between FDA officials and pharmaceutical company board positions is well-documented
  • This isn't a conspiracy theory; it's just capitalism doing what capitalism does: optimizing for profit

About AI Filmmaking:

  • Surrealism is AI's superpower: Rather than trying to create photorealistic narrative scenes (where AI often struggles with consistency), leaning into surreal, metaphorical imagery played to AI's strengths. The dreamlike quality became an asset, not a limitation.
  • Batch generation strategy matters: Generating all similar shots together (all hands, all corporate imagery, all performance shots) with consistent parameters created better visual cohesion than generating shots in arbitrary order.
  • Character consistency is possible but requires planning: Using Midjourney's character reference feature and maintaining the same seed values across shots helped keep the protagonist recognizable throughout.
  • Motion adds emotional weight: Even simple animations (scales cracking, hands crumbling to dust, numbers floating) transformed static images into visceral moments.
  • Lip sync technology has arrived: Kling Speak made the performance shots feel genuinely human. This was the breakthrough that let me create an authentic music video rather than just pretty pictures set to music.

How I Built It

1. Song Generation (Producer.ai)

I crafted a detailed prompt specifying:

  • Genre: Alt-pop/dark pop with emotional depth
  • Structure: Verse 1 (personal struggle) → Verse 2 (investigation/revelation) → Bridge (breaking point) → Liberation
  • Tone: Intimate and vulnerable building to powerful and defiant
  • Lyrical approach: Concrete imagery, specific details (weight loss drugs, scales, receipts), shift from "I" to "they" to "we"

I generated several variations with Producer.ai and selected the version that best balanced catchiness with emotional rawness.

2. Visual Concept Development

Rather than literal storytelling, I chose surrealism to represent emotional truth:

  • Obsession: Scales multiplying and orbiting like planets, woman praying to scale display like religious icon
  • Investigation: Pulling thread from food label revealing massive corporate web, products bleeding sugar like wounds
  • System Revealed: Giant corporate hands force-feeding then selling pills, octopus creature with tentacles holding different products
  • Liberation: Scale cracking like ice beneath lifting foot, walking away from monument-sized scale, golden cage with open door

Each lyric line received its own metaphorical visual interpretation: 38 core shots total.

3. Image Generation (Midjourney v6)

Character consistency strategy:

  • Created master reference prompt for protagonist (42-year-old woman, specific physical details, consistent wardrobe)
  • Generated multiple character shots first using same seed
  • Used --cref parameter to reference best shots across all performance images
  • Kept lighting descriptors consistent ("natural window light," "dramatic uplighting")
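
The strategy above can be sketched as a small prompt builder. Midjourney is driven through its chat or web interface, so this only assembles prompt strings to paste in. The `--cref`, `--seed`, `--ar`, and `--v` parameters are Midjourney's own; the character description, reference URL, seed value, and the `Shot` and `build_prompt` names are hypothetical placeholders, not the values used in the video.

```python
from dataclasses import dataclass

# Placeholder values for illustration only; not the ones used in the video.
MASTER_CHARACTER = "42-year-old woman, simple black clothing"
CHARACTER_REF_URL = "https://example.com/protagonist_reference.png"  # hypothetical
SEED = 1234  # hypothetical


@dataclass
class Shot:
    description: str
    lighting: str  # e.g. "natural window light"


def build_prompt(shot: Shot, aspect_ratio: str = "16:9") -> str:
    """Assemble one prompt that reuses the same character text, reference image, and seed."""
    text = f"{MASTER_CHARACTER}, {shot.description}, {shot.lighting}"
    params = f"--cref {CHARACTER_REF_URL} --seed {SEED} --ar {aspect_ratio} --v 6"
    return f"{text} {params}"


shots = [
    Shot("standing on a bathroom scale, medium close-up", "natural window light"),
    Shot("singing directly to camera, medium close-up", "dramatic uplighting"),
]
prompts = [build_prompt(s) for s in shots]
for p in prompts:
    print(p)
```

Keeping the character text, reference, and seed in one place means a change to the protagonist only has to be made once before regenerating a batch.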

Batch generation approach:

  • Generated all surreal object shots together (scales, pills, corporate imagery)
  • Generated all performance closeups together
  • Generated all wide/establishing shots together
  • Used consistent style parameters within each batch
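
One way to organize this kind of batching is to tag every shot with a batch name and attach a shared style suffix per batch. The sketch below is illustrative only: the shot IDs, batch names, and style strings are assumed for the example and are not the project's actual shot list.

```python
from collections import defaultdict

# Hypothetical shot list: (shot id, batch name, description).
SHOTS = [
    ("s01", "surreal_objects", "scales orbiting like planets"),
    ("s02", "performance_closeups", "woman singing, medium close-up"),
    ("s03", "surreal_objects", "pills transforming into butterflies"),
    ("s04", "wide_shots", "woman walking away from monument-sized scale"),
    ("s05", "performance_closeups", "woman looking at scale display"),
]

# Hypothetical style suffix per batch; every shot in a batch shares it.
BATCH_STYLE = {
    "surreal_objects": "surreal, cold mechanical lighting --ar 16:9",
    "performance_closeups": "natural window light --ar 16:9",
    "wide_shots": "cinematic wide shot --ar 16:9",
}


def plan_batches(shots):
    """Group shots by batch and append the batch's shared style to each prompt."""
    batches = defaultdict(list)
    for shot_id, batch, description in shots:
        batches[batch].append((shot_id, f"{description}, {BATCH_STYLE[batch]}"))
    return dict(batches)


batches = plan_batches(SHOTS)
for name, items in batches.items():
    print(name)
    for shot_id, prompt in items:
        print(f"  {shot_id}: {prompt}")
```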

Prompt engineering for motion:

  • Left 10% padding around frame edges for camera movement
  • Specified "slow motion," "floating," "crumbling" in prompts for shots I knew would be animated
  • Generated higher resolution for shots requiring zoom/pan
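
To make the padding rule concrete, here is a small worked calculation. It assumes the 10% applies to each edge and uses a 1920x1080 frame as an example size; neither detail is stated above, and the function names are made up for the sketch.

```python
def safe_area(width: int, height: int, padding: float = 0.10):
    """Return (left, top, right, bottom) of the region left after padding every edge."""
    dx = round(width * padding)
    dy = round(height * padding)
    return dx, dy, width - dx, height - dy


def max_zoom(padding: float = 0.10) -> float:
    """Largest centered zoom-in that still keeps the whole safe area in frame."""
    return 1 / (1 - 2 * padding)


box = safe_area(1920, 1080)
print(box)                   # (192, 108, 1728, 972)
print(round(max_zoom(), 2))  # 1.25
```

Under these assumptions, a subject composed inside the safe area survives a centered zoom of up to 1.25x without being cropped.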

Total images generated: ~150 (selected 45 for final edit)

4. Lip Sync Performance (Kling Speak / Higgsfield)

  • Selected 8 key performance shots (vulnerable verse, powerful chorus, angry pre-chorus, triumphant finale)
  • Uploaded protagonist face images and audio stems
  • Kling Speak's AI generated realistic lip movements synced to vocals
  • This was the game-changer: turned static images into genuine performance

5. Motion Graphics & Animation (Runway ML, Pika)

Animated shots:

  • Scales multiplying and orbiting (Runway Gen-3)
  • Drawer opening with light orbs floating (Pika)
  • Woman falling through impossible space (Runway)
  • Corporate hands crumbling to dust (Pika)
  • Scale cracking beneath lifting foot (Runway)
  • Pills transforming into butterflies (Pika)

Static shots with motion:

  • Ken Burns effects (slow zoom/pan) on portraits and wide shots
  • Subtle breathing/wind effects added to static performance shots
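
For readers unfamiliar with the Ken Burns effect, the idea reduces to interpolating a crop box over time and scaling each crop back up to the output size. In an editor such as CapCut this is done with keyframes; the sketch below only illustrates the underlying math for a centered zoom-in. The frame size, zoom amount, and frame count are assumed example values.

```python
def ken_burns_crops(width, height, end_zoom, frames):
    """Yield centered (left, top, right, bottom) crop boxes for a slow linear zoom-in.

    Zoom goes from 1.0 (full image) to end_zoom; each crop would then be
    scaled back up to the output size by the editor or renderer.
    """
    for i in range(frames):
        t = i / (frames - 1) if frames > 1 else 0.0
        zoom = 1.0 + (end_zoom - 1.0) * t
        crop_w, crop_h = width / zoom, height / zoom
        left, top = (width - crop_w) / 2, (height - crop_h) / 2
        yield (round(left), round(top), round(left + crop_w), round(top + crop_h))


crops = list(ken_burns_crops(1920, 1080, end_zoom=1.25, frames=5))
print(crops[0])   # (0, 0, 1920, 1080)
print(crops[-1])  # (192, 108, 1728, 972)
```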

6. Editing & Post-Production (CapCut)

Timeline structure:

  • Beat-synced cuts (every shot change aligned to musical beats or lyric phrases)
  • Color grading evolution: sterile blues (verse 1) → sickly greens (verse 2) → warm golds (finale)
  • Aspect ratio shifts: 16:9 for narrative, 9:16 for intimate moments, 2.35:1 for epic scale
  • Transition shots: Pills burning, receipts floating like leaves, logos melting
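
Beat-synced cutting can be approximated by computing beat timestamps from the tempo and snapping each rough cut to the nearest one. This is a sketch, not the workflow used in CapCut: the 120 BPM tempo and the rough cut times are hypothetical, and only the 3:30 runtime comes from the project.

```python
def beat_times(bpm: float, duration_s: float):
    """Timestamps (seconds) of every beat from 0 up to the song's duration."""
    interval = 60.0 / bpm
    count = int(duration_s / interval) + 1
    return [round(i * interval, 3) for i in range(count)]


def snap_to_beats(cuts, beats):
    """Move each rough cut time to the nearest beat."""
    return [min(beats, key=lambda b: abs(b - c)) for c in cuts]


BPM = 120         # hypothetical tempo; the song's real BPM is not stated
DURATION_S = 210  # 3:30 runtime
beats = beat_times(BPM, DURATION_S)
rough_cuts = [4.8, 10.3, 15.6, 21.1]  # hypothetical cut points in seconds
print(snap_to_beats(rough_cuts, beats))  # [5.0, 10.5, 15.5, 21.0]
```

Cuts aligned to lyric phrases rather than beats would need the phrase timestamps instead, but the snapping step stays the same.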

Audio mixing:

  • Balanced vocals against surround sound design
  • Added subtle ambient textures (corporate hum, breathing, environmental sounds)
  • Created space for silence in key revelatory moments

Text overlays:

  • End statistics: "73% of Americans are overweight or obese. The weight loss industry: $4.5 trillion. You do the math."
  • Minimal, impactful, documentary-style typography

Total edit time: ~40 hours across 2 weeks

Challenges I Faced

Challenge 1: Character Consistency Across 45+ Shots

Problem: Early generation attempts created a protagonist who looked different in every shot (different face, different hair), which completely broke immersion.

Solution:

  • Created a detailed master character prompt with specific physical traits
  • Generated 20 variations, selected the most consistent face as reference
  • Used Midjourney's --cref feature religiously on every subsequent shot
  • Kept wardrobe ultra-simple (black clothing) to reduce variables
  • Accepted that perfect consistency is impossible; aimed for "recognizably the same person"

Challenge 2: Surrealism vs. Coherence

Problem: How do you create visually surreal imagery that still tells a clear story? Too literal and it's boring; too abstract and viewers get lost.

Solution:

  • Every surreal image had to answer: "What emotion or concept does this represent?"
  • Created a visual logic system: corporate = cold/mechanical, liberation = organic/warm, truth = light breaking through
  • Tested early cuts with friends: "What do you think is happening here?" If they couldn't articulate it, I revised
  • Used familiar objects in unfamiliar ways (scales, pills, hands) rather than completely abstract shapes

Challenge 3: Lip Sync Uncanny Valley

Problem: Early attempts at lip sync looked creepy or obviously fake, which destroyed the emotional authenticity of performance shots.

Solution:

  • Kling Speak was a breakthrough, far more realistic than the earlier tools I tried
  • Only used lip sync on medium close-ups, not extreme close-ups (where flaws are more visible)
  • Chose shots with good lighting and front-facing angles for best results
  • Accepted that some shots would remain static—better to have powerful stills than bad animation
  • Intercut lip-synced performance with non-performance shots to manage viewer fatigue

Challenge 4: Pacing & Emotional Arc

Problem: With 38 shots across 3:30, how do you maintain visual interest without causing exhaustion? How do you build emotional momentum?

Solution:

  • Mapped song structure to visual intensity: start intimate, build to chaos, release to clarity

Built With

  • capcut
  • higgsfield
  • kling
  • midjourney
  • producer.ai