Inspiration

As a mixed-heritage individual, I've often struggled with my identity. Not being able to speak the native tongue of either side of my heritage always bothered me, and I wished there were an entertaining way to learn my languages. We believe language and music are central parts of identity, and Duosingo was born from the desire to turn that struggle into a bridge for cultural reclamation.

What it does

Duosingo is an AI-powered platform where "singing meets learning". Users upload any song and choose the language they wish to learn. The app generates a karaoke follow-along with translated lyrics, letting users sing their way to fluency. Duosingo then evaluates their pronunciation and creates personalized learning plans to help them master their specific "Ghost Accents".

How we built it

We built a robust "Data → Insights → Action" loop using a sophisticated AI stack:

  • Audio Processing: Used ElevenLabs for transcribing stem-separated audio and dubbing isolated artist voices into new languages.
  • Lyric Standardization: Integrated the Genius API to import official lyrics as the standard for accuracy.
  • Linguistic Intelligence: Leveraged Gemini via Backboard.io to translate lyrics while maintaining tone and to compare user vocal responses against the correct phonetics.
  • Phonetic Analysis: Developed a workflow to phonemize both song and human audio to generate a real-time accuracy score.
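The accuracy-scoring step of the loop above can be sketched as a sequence alignment between the song's phonemes and the learner's. This is a minimal illustration, not our production engine: the phoneme lists here stand in for the output of a phonemizer, and the score is simply the fraction of reference phonemes the learner matched.

```python
from difflib import SequenceMatcher

def phoneme_accuracy(reference: list[str], spoken: list[str]) -> float:
    """Score how closely the learner's phonemes match the song's,
    as the ratio of aligned matches to the reference length."""
    if not reference:
        return 0.0
    matcher = SequenceMatcher(a=reference, b=spoken)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(reference)

# Toy phoneme sequences standing in for real phonemizer output.
reference = ["o", "l", "a"]          # target pronunciation
spoken = ["o", "r", "a"]             # learner substituted 'r' for 'l'
print(round(phoneme_accuracy(reference, spoken), 2))  # 0.67
```

Alignment-based scoring (rather than position-by-position comparison) keeps the score fair when the learner drops or inserts a sound mid-line.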

Challenges we ran into

One of our primary challenges was the phonetic comparison engine: early versions produced incorrect outputs when aligning the transcribed audio against the official lyrics, since transcripts and printed lyrics differ in casing, punctuation, and extra or dropped words. We also navigated SSL protocol violations on campus networks when streaming large audio files to our cloud APIs.
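The alignment fix above came down to normalizing both texts before matching them. A hedged sketch of that idea, with simplified word-level alignment in place of our full engine (the example strings are illustrative, not from a real song):

```python
import re
from difflib import SequenceMatcher

def normalize(text: str) -> list[str]:
    """Lowercase and strip punctuation so transcription quirks
    don't break the match against official lyrics."""
    return re.findall(r"[a-z']+", text.lower())

def align_to_official(transcript: str, official: str) -> list[tuple[str, str]]:
    """Pair each transcribed word with its official counterpart,
    using '' where a word was dropped or inserted."""
    t, o = normalize(transcript), normalize(official)
    pairs: list[tuple[str, str]] = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(a=t, b=o).get_opcodes():
        if tag == "equal":
            pairs.extend(zip(t[i1:i2], o[j1:j2]))
        elif tag == "delete":       # extra word in the transcript
            pairs.extend((w, "") for w in t[i1:i2])
        elif tag == "insert":       # word missing from the transcript
            pairs.extend(("", w) for w in o[j1:j2])
        else:                       # replace: pair them up positionally
            for k in range(max(i2 - i1, j2 - j1)):
                pairs.append((t[i1 + k] if i1 + k < i2 else "",
                              o[j1 + k] if j1 + k < j2 else ""))
    return pairs

pairs = align_to_official("Ohh we built this city", "We built this city")
print(pairs[:2])  # [('ohh', ''), ('we', 'we')]
```

Without the normalization step, mismatched casing alone is enough to misalign entire lines.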

Accomplishments that we're proud of

We successfully created a self-improving AI model. By treating linguistic mistakes as behavioral data, our system iteratively personalizes language lessons, ensuring the product experience evolves to address each user's unique struggle points.
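The core of that self-improving loop can be sketched as a struggle tracker: every mispronounced phoneme is logged as a data point, and the next drill targets the most-missed sounds. This is a minimal illustration with hypothetical class and method names, not our actual lesson planner:

```python
from collections import Counter

class StruggleTracker:
    """Treat each mispronounced phoneme as behavioral data and
    surface the sounds the learner struggles with most."""

    def __init__(self) -> None:
        self.misses: Counter[str] = Counter()

    def record(self, expected: list[str], spoken: list[str]) -> None:
        # Count positions where the expected phoneme was not produced.
        for exp, got in zip(expected, spoken):
            if exp != got:
                self.misses[exp] += 1

    def next_lesson(self, n: int = 3) -> list[str]:
        """Pick the n most-missed phonemes as the next drill."""
        return [p for p, _ in self.misses.most_common(n)]

tracker = StruggleTracker()
tracker.record(["r", "o", "x"], ["l", "o", "s"])  # rolled 'r' and 'x' missed
tracker.record(["r", "a"], ["l", "a"])            # 'r' missed again
print(tracker.next_lesson(2))  # ['r', 'x']
```

Because the drill is derived from observed misses rather than a fixed syllabus, the curriculum shifts automatically as the learner improves.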

We also created a method for generating a real-time lyric video with properly translated lyrics, using the Gemini API, ElevenLabs, and OpenAI Whisper.
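The display side of that pipeline reduces to a lookup: given lines timestamped by the transcription step, pick the original/translated pair to highlight at the current playback time. A small sketch, with made-up example lines and Whisper-style segment start times as an assumed input shape:

```python
def line_at(timed_lines: list[tuple[float, str, str]], t: float) -> tuple[str, str]:
    """Return the (original, translated) line active at playback time t,
    given lines sorted by start time in seconds."""
    start, original, translated = timed_lines[0]
    for s, orig, trans in timed_lines:
        if s <= t:
            original, translated = orig, trans
        else:
            break
    return original, translated

# Hypothetical timed, translated lyric lines.
lines = [
    (0.0, "Hola, ¿cómo estás?", "Hello, how are you?"),
    (4.2, "Canto para aprender", "I sing to learn"),
]
print(line_at(lines, 5.0))  # ('Canto para aprender', 'I sing to learn')
```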

What we learned

We learned that behavioral analytics can be a powerful driver for AI personalization. By tracking "phoneme struggle" events, we discovered we could move beyond simple rules-based logic to create a curriculum that feels truly alive and responsive to the user's needs.

What's next for Duosingo

The next step for Duosingo is expanding our self-improving model to support a wider variety of global languages and genres. We also plan to integrate real-time voice cloning to allow users to hear their own heritage songs sung back to them in their own mastered voice.
