What it does
We created a desktop application that separates the vocals, bass, and drums of a song on YouTube, then plays the song and its video in real time. Additional features include synced lyrics, a search bar with autocomplete, and the option to choose which tracks are played.
How we built it
When the user chooses a song, its audio is downloaded from YouTube and cut into chunks. Each chunk is asynchronously separated into vocal, bass, and drum tracks using the Demucs source separation model's API. The tracks are then played on top of one another, and the user can exclude specific tracks. We built the frontend GUI with PyQt to handle user input as well as lyric and video output. For user input, we implemented autocomplete by forwarding the user's text to YouTube's search API and converting the chosen result into a YouTube link, which feeds the download and separation steps described above. Video output is handled with OpenCV, which fetches the video frames that correspond to the current audio position and displays them on screen. Lyrics are retrieved with the syncedlyrics API and are synced with the video and audio using the same method.
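The chunking, track mixing, and position-based syncing described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not our actual implementation: the function names (`chunk_bounds`, `mix_stems`, `frame_for_position`, `lyric_for_position`) are hypothetical, audio is represented as plain lists of samples, and lyrics are assumed to be available as `(start_time_in_seconds, text)` pairs sorted by time.

```python
from bisect import bisect_right


def chunk_bounds(total_samples, chunk_samples):
    """Return (start, end) sample ranges that cover the whole song."""
    return [
        (start, min(start + chunk_samples, total_samples))
        for start in range(0, total_samples, chunk_samples)
    ]


def mix_stems(stems, enabled):
    """Sum the enabled stems sample by sample; excluded stems are skipped.

    `stems` maps a track name to a list of samples of equal length.
    """
    active = [stems[name] for name in enabled]
    if not active:
        # Nothing selected: return silence of the same length as the stems.
        length = len(next(iter(stems.values()), []))
        return [0.0] * length
    return [sum(samples) for samples in zip(*active)]


def frame_for_position(position_s, fps):
    """Index of the video frame that matches the current audio position."""
    return int(position_s * fps)


def lyric_for_position(position_s, timed_lyrics):
    """Return the most recent lyric line that started at or before position_s."""
    times = [start for start, _ in timed_lyrics]
    index = bisect_right(times, position_s) - 1
    return timed_lyrics[index][1] if index >= 0 else ""
```

Because both the frame lookup and the lyric lookup are driven by the same audio position, the audio clock acts as the single source of truth, so video and lyrics cannot drift apart from each other.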
Challenges we ran into
The biggest challenge we faced was getting our model to process audio in real time. To do so, we pre-process the audio by cutting it into chunks, then split each chunk asynchronously with the model. Running Python code asynchronously was a challenge of its own and required research into multithreading and multiprocessing. In the end, we chose to run each task in a separate child process so that the tasks could run simultaneously.
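As a rough sketch of the child-process approach, the standard library's `concurrent.futures` module can hand chunks to worker processes. The code below is an assumption-laden illustration rather than our actual code: `separate_chunk` is a hypothetical stand-in for the real model call, and the `use_processes` flag is added only so the same function can fall back to threads for quick experiments.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def separate_chunk(chunk):
    """Hypothetical stand-in for the source separation model call.

    Takes (index, samples) and returns (index, stems). A real version
    would run the model here; this one just copies the input to each stem.
    """
    index, samples = chunk
    stems = {"vocals": list(samples), "bass": list(samples), "drums": list(samples)}
    return index, stems


def separate_all(chunks, max_workers=2, use_processes=True):
    """Separate all chunks concurrently and return results in input order.

    With use_processes=True each task runs in a child process, which avoids
    the global interpreter lock for CPU-heavy work.
    """
    executor_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with executor_cls(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs,
        # even if later chunks finish first.
        return list(pool.map(separate_chunk, chunks))
```

When `use_processes=True` is used from a script, the call to `separate_all` should sit under an `if __name__ == "__main__":` guard so that child processes can be started safely on platforms that spawn rather than fork, such as Windows.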
What's next for Okedoke Karaoke
Our top priority is improving the application's performance by reducing lag in the video display and improving the overlap between audio chunks. Additionally, we plan to improve cross-platform support, as there are currently some path issues on Windows. We also plan to add a few features, including sliders to control the volume of each track and support for other platforms such as Spotify.
