Inspiration
We wanted to make chatting with an AI feel more real. Not just text on a screen, but a system that actually talks back in multiple languages. Basically, a multilingual conversation partner that feels alive.
What it does
You say something, the AI thinks, and then speaks the answer out loud. It remembers the conversation so it doesn’t feel like you’re starting from scratch every time. And it works in multiple languages, which is pretty cool.
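The "remembers the conversation" part can be sketched roughly like this. This is a minimal illustration with our own names (`Conversation`, `prompt`), not the actual project code: keep the past turns in a list and prepend them to every new prompt so the model sees the history.

```python
# Minimal conversation-memory sketch (hypothetical names, not the real code).
class Conversation:
    def __init__(self):
        self.turns = []  # list of (speaker, text) pairs, oldest first

    def add(self, speaker, text):
        self.turns.append((speaker, text))

    def prompt(self, user_text):
        """Record the new user turn and build a prompt containing the whole history."""
        self.add("user", user_text)
        return "\n".join(f"{who}: {what}" for who, what in self.turns)
```

Each AI reply would then be added back with `add("assistant", reply)`, so nothing starts from scratch.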
How we built it
We used Google Gemini to generate the text responses and ElevenLabs to turn them into speech. Python scripts tie everything together: take the AI output, save it, and automatically send it to text-to-speech. API keys stay out of the code in a .env file.
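A rough, stdlib-only sketch of the "send the AI output to TTS" step, assuming the ElevenLabs REST text-to-speech endpoint (the function names and the `reply.mp3` path are our own, and the real project used the `elevenlabs` package rather than raw HTTP):

```python
import json
import os
import urllib.request

ELEVEN_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"

def build_tts_request(text, voice_id, api_key, model_id="eleven_multilingual_v2"):
    """Construct the HTTP request for ElevenLabs' text-to-speech endpoint."""
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        ELEVEN_URL.format(voice_id=voice_id),
        data=body,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
    )

def speak(text, voice_id, out_path="reply.mp3"):
    """Send the AI's text reply to TTS and save the returned audio (network call)."""
    req = build_tts_request(text, voice_id, os.environ["ELEVENLABS_API_KEY"])
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
```

The Gemini side works the same way in spirit: generate text, hand the string to `speak`, play the file.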
Challenges we ran into
We struggled at first with ElevenLabs because its SDK had just been updated and many tutorials were out of date. We worked around this by pinning an older version of the ElevenLabs package.
Accomplishments that we're proud of
We are proud that it works, especially the text-to-speech with ElevenLabs, which was definitely the most stressful and difficult part of the project.
What we learned
We learned a lot. This was the first AI project for all of us, so the learning curve was steep: from simple things like using a .env file, to wiring up the full loop of saving audio files, turning text into speech, and getting an AI response to drive it all.
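For anyone curious what the .env part amounts to: a .env file is just KEY=VALUE lines kept out of version control. Projects usually load it with the python-dotenv package, but a minimal stdlib version (our own sketch, not the project's code) looks like this:

```python
import os

def load_env(path=".env"):
    """Minimal .env loader: read KEY=VALUE lines into os.environ.

    Skips blank lines and comments; existing environment variables win.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

After `load_env()`, the scripts can read `os.environ["ELEVENLABS_API_KEY"]` without the key ever appearing in the source.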
What's next for Multilingual Conversation Partner
More languages, plus faster and more human-like responses to the user.