Inspiration
We all have songs from indie artists that we want to sing, but that don't have a karaoke version yet. We created this web app so that everyone can have an easy, smooth karaoke experience. Just paste the YouTube URL of your favorite artist's video and sing along!
What it does
- When you input the URL, the web app will call our backend API. The API will then:
- Download the video's audio in MP3 format
- Split the vocals from the instrumental using a machine learning model called Spleeter, and save them as two separate files in MongoDB
- Use the OpenAI Whisper speech recognition model to extract the lyrics along with their timestamps
- Return the instrumental file along with the lyrics JSON to the frontend
- Our frontend will play the audio file along with the lyrics
How we built it
Backend:
- Split the audio into vocal and instrumental tracks using Spleeter (a machine learning model for separating vocals from audio)
- Get the lyrics from the vocal file using speech-to-text
- Use AWS and MongoDB Atlas to store the
Challenges we ran into
Backend: AWS Speech to Text did not work as well as expected, and neither did Google Cloud Speech-to-Text. We had to use OpenAI instead.
Accomplishments that we're proud of
We were able to incorporate two machine learning libraries into our tech stack to create a smooth user experience.
What we learned
The backend is very important, and we should pay attention to configuration for maintainability.
What's next for Keen Karoke
- Language learning with songs
- Bridge the gap in watching videos for people with hearing impairments
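The timestamped lyrics JSON that the backend returns is what lets the frontend show each line at the right moment, and both ideas above would likely reuse it. The following is a minimal Python sketch, not the project's actual code. It assumes each transcription segment is a dict with `start`, `end`, and `text` fields (the shape used for `segments` in the result of the open-source `whisper` package's `transcribe()`). The helper names `segments_to_lyrics` and `active_line`, the `lyrics` key, and the sample data are all hypothetical.

```python
import bisect
import json


def segments_to_lyrics(segments):
    """Convert Whisper-style segments into a compact, time-sorted lyrics list."""
    lyrics = [
        {
            "start": round(float(seg["start"]), 2),
            "end": round(float(seg["end"]), 2),
            "text": seg["text"].strip(),
        }
        for seg in segments
        if seg["text"].strip()  # drop segments with no recognized words
    ]
    lyrics.sort(key=lambda line: line["start"])
    return lyrics


def active_line(lyrics, t):
    """Return the lyric line sung at playback time t (seconds), or None."""
    starts = [line["start"] for line in lyrics]
    # Index of the last line whose start time is <= t.
    i = bisect.bisect_right(starts, t) - 1
    if i >= 0 and t < lyrics[i]["end"]:
        return lyrics[i]
    return None


# Hypothetical sample data standing in for a real transcription result.
segments = [
    {"start": 0.0, "end": 2.5, "text": " First line"},
    {"start": 2.5, "end": 5.0, "text": " Second line"},
    {"start": 7.0, "end": 9.0, "text": " Third line"},
]

lyrics = segments_to_lyrics(segments)
payload = json.dumps({"lyrics": lyrics})  # what an API response body might carry
```

For example, `active_line(lyrics, 3.0)` returns the second line here, while a time that falls in the gap between lines yields `None`, which a player could treat as an instrumental break.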