Inspiration
Inspired by synesthesia and multi-sensory experiences. The plan was to create a VR music video to enhance the music-listening experience, and this project is the first step toward that goal.
What it does
Given a user-selected track on Spotify, it uses Spotify's audio analysis to inform prompts for a 3D animation generated by AI.
How we built it
We use a Jupyter Notebook to connect to Spotify's API, pull track data, and identify the tempo and beats. Noodle Soup then generates a randomized prompt for each beat. Finally, Disco Diffusion takes the given settings and prompts and generates a 3D animation, which can then be synced to the music and post-processed.
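A minimal sketch of the beat-to-keyframe step, assuming the shape of the response from spotipy's `sp.audio_analysis(track_id)` call; the `beats_to_keyframes` function and the sample data are illustrative, not the project's actual code:

```python
# Relevant slice of Spotify's audio-analysis response, as returned by
# spotipy's sp.audio_analysis(track_id) (sample values, not a real track):
sample_analysis = {
    "track": {"tempo": 120.0},
    "beats": [
        {"start": 0.0, "duration": 0.5, "confidence": 0.9},
        {"start": 0.5, "duration": 0.5, "confidence": 0.8},
        {"start": 1.0, "duration": 0.5, "confidence": 0.7},
        {"start": 1.5, "duration": 0.5, "confidence": 0.9},
    ],
}

def beats_to_keyframes(analysis, fps=12):
    """Convert beat start times (seconds) into animation frame indices,
    so a fresh prompt can be scheduled on every beat."""
    return sorted({round(b["start"] * fps) for b in analysis["beats"]})

print(beats_to_keyframes(sample_analysis))  # [0, 6, 12, 18]
```

The resulting frame indices can be dropped straight into Disco Diffusion's per-frame prompt schedule, with one Noodle Soup prompt assigned per keyframe.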
Challenges we ran into
We planned to integrate the pipeline into a React app, but that wasn't feasible in the timeframe. Rendering and computing animation frames can take a long time, which made the program difficult to test, and running in Google Colab meant running into GPU usage limits.
Accomplishments that we're proud of
We had no experience with Jupyter Notebooks prior to this project, so it's pretty cool that the whole thing is built out of one.
What we learned
Jupyter Notebooks are powerful and useful for machine learning and data analysis. Spotify's APIs are fairly easy to pick up because of the sheer amount of documentation. And AI-generated art takes a long time to render.
What's next for AIMV
Creating prompts based on more of the audio analysis (sections of the track, danceability, energy, timbre, etc.) and using OpenAI's Whisper to transcribe lyrics for prompts as well. Adding VR capabilities. Possibly building a React application to run the program.
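One way the feature-driven prompts could work, sketched with placeholder thresholds and wording (the function name and mappings are hypothetical; danceability and energy come from Spotify's audio features on 0.0-1.0 scales):

```python
def prompt_for_features(danceability, energy):
    """Pick prompt modifiers from Spotify audio-feature values.
    Thresholds and phrases are illustrative placeholders, not tuned values."""
    mood = "pulsing, kinetic" if danceability > 0.6 else "slow, drifting"
    palette = "vivid neon colors" if energy > 0.6 else "muted pastel tones"
    return f"a surreal 3D landscape, {mood}, {palette}"

print(prompt_for_features(0.8, 0.3))
# a surreal 3D landscape, pulsing, kinetic, muted pastel tones
```

Per-section values would let the animation shift style as the track moves between sections, instead of using one prompt pool for the whole song.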
Built With
- disco-diffusion
- jupyter
- noodle-soup
- spotipy