Inspiration
Inspired by and adapted from Robert Frost's poem "Stopping by Woods on a Snowy Evening", which was published in 1923.
What it does
This AI-powered music video fuses visuals and sound to reimagine Robert Frost’s classic poem "Stopping by Woods on a Snowy Evening." Using advanced AI generation technology, it transforms the poem’s serene imagery—snow-covered woods, the hush of night, and the tension between wanderlust and duty—into dreamlike visual scenes that interweave and resonate with an original score.
How we built it
The music was created using ElevenLabs Music. I entered the text of Robert Frost’s poem directly into the platform with no additional prompt, so ElevenLabs Music chose the musical style on its own, based on the poem’s content. The track used in the video is a one-take generation, which shows how well ElevenLabs Music interprets the text.
The video was produced with Runway’s Aleph model. Its multimodal video editing features allow flexible edits to uploaded footage, including changes to the environment, characters, angles, lighting, and other elements within the video. It can also generate new content through its "video-to-video" functionality.
This 2-minute music video was crafted using only 3 storyboard frames (attached in the Image Gallery of this project). All video content was "expanded and evolved" from the initial footage of these 3 frames.
The sound effects were also created with ElevenLabs.
Challenges we ran into
The character design is quite complex: it must include not only a person but also a horse, and both must stay consistent across scenes.
I used multi-image reference capabilities. First, I designed the person and the horse separately, then combined them into the same image. The horse needed particular care, because the poem includes the line "He gives his harness bells a shake," so it had to wear harness bells. It was also while drafting the prompts for this character that I realized a saddle and a stirrup are two different things. In short, a lot of effort went into the character design to maintain consistency.
Accomplishments that we're proud of
I finished this 2-minute music video using only 3 storyboard frames.
What we learned
How to utilize the video editing capabilities of multimodal models, and how to design details for subsequent plot development during the character design phase.
What's next for Stopping by Woods on a Snowy Evening
Share the video and collect feedback to help improve AI music generation, and compare the video editing capabilities of the Runway Aleph model with traditional image-to-video production methods.
Built With
- capcut
- elevenlabs
- elevenlabsmusic
- runway
- runwayaleph
