Okedoke Karaoke

Devil Is a Lie

What it does

We created a desktop application that separates the vocals, bass, and drums of a song on YouTube, then plays the song and its video in real time. Additional features include synced lyrics, a search bar with autocomplete, and the option to choose which tracks are played.

How we built it

When the user chooses a song, the audio is downloaded from YouTube and split into chunks. Each chunk is asynchronously separated into vocal, bass, and drum tracks using the demucs source separation model API. The tracks are played back layered on top of each other, and the user can exclude specific tracks.

We built the frontend GUI with PyQt to handle user input as well as lyric and video output. For user input, we implemented autocomplete by forwarding the user's text to YouTube's search API and converting the chosen result into a YouTube link. The audio behind that link then goes through the download, separation, and playback steps described above.

Video output is handled with OpenCV, which fetches the frames that match the current audio position and displays them on screen. Lyrics come from the syncedlyrics API and are synced to the audio and video using the same method.
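The timing logic in this pipeline can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the function names (`chunk_bounds`, `mix_stems`, `frame_index`, `current_lyric`) are hypothetical, audio is represented as plain lists of floats, and the real calls to demucs, OpenCV, and syncedlyrics are left out. It assumes every stem of a chunk has the same length and that lyrics arrive as a sorted list of (start time, text) pairs.

```python
from bisect import bisect_right


def chunk_bounds(total_samples, chunk_samples):
    """Return (start, end) sample ranges that together cover the whole song."""
    if chunk_samples <= 0:
        raise ValueError("chunk_samples must be positive")
    return [
        (start, min(start + chunk_samples, total_samples))
        for start in range(0, total_samples, chunk_samples)
    ]


def mix_stems(stems, enabled):
    """Sum the enabled stems sample by sample; excluded stems are skipped.

    `stems` maps a track name ("vocals", "bass", "drums") to its samples.
    """
    active = [samples for name, samples in stems.items() if name in enabled]
    if not active:
        # Nothing selected: return silence of the same length as the chunk.
        length = len(next(iter(stems.values()), []))
        return [0.0] * length
    return [sum(values) for values in zip(*active)]


def frame_index(audio_seconds, fps):
    """Map the current audio position to the video frame that should be shown."""
    return int(audio_seconds * fps)


def current_lyric(timed_lines, audio_seconds):
    """Return the lyric line whose start time is the latest one not after the position.

    `timed_lines` is assumed to be sorted by start time.
    """
    starts = [start for start, _ in timed_lines]
    i = bisect_right(starts, audio_seconds) - 1
    return timed_lines[i][1] if i >= 0 else ""
```

Driving both the frame lookup and the lyric lookup from the same audio position keeps the audio as the single clock, so video and lyrics cannot drift apart from each other, only lag behind the audio.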

Challenges we ran into

The biggest challenge we faced was getting our model to process audio in real time. To do this, we pre-process the audio by cutting it into chunks, then split each chunk with the model asynchronously. Running Python code asynchronously was a challenge of its own and required research into multithreading and multiprocessing. We chose to run each task in a separate child process so that the tasks could run simultaneously.
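One way to run chunk separation in child processes is a process pool from Python's standard `concurrent.futures` module. The sketch below is an assumption about the general shape, not the project's implementation: the writeup does not say whether a pool or individually started processes were used, `separate_chunk` is a hypothetical stand-in for the real demucs call, and `separate_all` is a name invented for this example.

```python
import concurrent.futures as cf


def separate_chunk(chunk):
    """Hypothetical stand-in for the real separation step.

    Takes (index, samples) and returns (index, stems). Here every stem is
    just a copy of the input so the example runs without a model.
    """
    index, samples = chunk
    stems = {name: list(samples) for name in ("vocals", "bass", "drums")}
    return index, stems


def separate_all(chunks, worker=separate_chunk,
                 executor_cls=cf.ProcessPoolExecutor, max_workers=2):
    """Run `worker` on every chunk in a pool and return stems in chunk order.

    Chunks may finish out of order, so results are keyed by chunk index
    and sorted before being returned.
    """
    results = {}
    with executor_cls(max_workers=max_workers) as pool:
        futures = [pool.submit(worker, chunk) for chunk in chunks]
        for future in cf.as_completed(futures):
            index, stems = future.result()
            results[index] = stems
    return [results[i] for i in sorted(results)]
```

With the default `ProcessPoolExecutor`, each chunk is handled in a child process, which sidesteps the global interpreter lock for CPU-bound work. On platforms that start child processes by spawning a fresh interpreter, the call to `separate_all` should sit under an `if __name__ == "__main__":` guard.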

What's next for Okedoke Karaoke

Our top priority is improving the application's performance by reducing lag in the video display and improving the overlap between audio chunks. We also plan to improve cross-platform support, as there are currently some path issues on Windows. Finally, we plan to add a few features, including sliders to control the volume of each track and support for other platforms such as Spotify.

Built With

demucs
opencv
pyqt
python
ytdl

Updates

Ziyang Chen started this project — Oct 20, 2024 12:46 PM EDT
