Inspiration
This was a pain point I wanted to solve for myself: improving my pronunciation of English words.
What it does
It uses Gemini together with ElevenLabs text-to-speech and speech-to-text models to generate vowels and consonants along with their correct pronunciation. A user can then pronounce those words too and get recommendations on how to improve their pronunciation.
How we built it
I built it using Next.js, Gemini, ElevenLabs, and JavaScript.
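The flow described above (reference audio, transcription of the learner's attempt, then AI feedback) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `runPronunciationCheck` and the injected `synthesize`, `transcribe`, and `suggest` functions are hypothetical names, standing in for whatever wrappers call ElevenLabs and Gemini.

```javascript
// Hypothetical sketch: the three service calls are injected, so no real
// Gemini or ElevenLabs API signature is assumed here.
async function runPronunciationCheck(word, recordedAudio, services) {
  const { synthesize, transcribe, suggest } = services;

  // 1. Text to speech: reference audio the learner can listen to.
  const referenceAudio = await synthesize(word);

  // 2. Speech to text: what the model heard in the learner's recording.
  const heard = await transcribe(recordedAudio);

  // 3. Ask the language model for feedback on the target vs. the heard text.
  const recommendation = await suggest(word, heard);

  // A case-insensitive exact match is the simplest possible success signal.
  const matched = heard.trim().toLowerCase() === word.trim().toLowerCase();

  return { word, heard, matched, referenceAudio, recommendation };
}
```

Passing the services in as arguments keeps the flow easy to test with stubs and makes it possible to swap the speech provider later without touching this function.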
Challenges we ran into
The ElevenLabs text-to-speech and speech-to-text APIs have some lapses: the models sometimes hallucinate.
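One possible way to catch a hallucinated transcript before it reaches the feedback step is to check how far it is from the word the learner was asked to say. The sketch below is an assumption, not something the project is known to do: `normalize`, `editDistance`, and `isPlausibleTranscript` are hypothetical helpers, and the default `maxRatio` of 0.5 is an arbitrary threshold that would need tuning.

```javascript
// Keep only letters and apostrophes. Spaces are dropped too, which is
// acceptable here because the target is assumed to be a single word.
function normalize(text) {
  return text.toLowerCase().replace(/[^a-z']/g, "");
}

// Levenshtein distance using a single rolling row.
function editDistance(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // value from the previous row, previous column
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // value from the previous row, same column
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Treat a transcript as plausible only if it is reasonably close to the target.
function isPlausibleTranscript(target, transcript, maxRatio = 0.5) {
  const expected = normalize(target);
  const heard = normalize(transcript);
  if (expected.length === 0) return false;
  return editDistance(expected, heard) / expected.length <= maxRatio;
}
```

A transcript that fails the check could be discarded and the learner asked to record again, rather than being sent on for recommendations.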
Accomplishments that we're proud of
The ability to get recommendations from an AI on how to improve the pronunciation of words.
What we learned
Never to use the ElevenLabs APIs for TTS and STT. Next time, we may use a paid service on Google Cloud or Azure instead.
What's next for Speechify
Curated assessments for users, based on a standardized curriculum, using a cloud-based model with a lower error rate in its text-to-speech and speech-to-text APIs.
Built With
- elevenlabs
- gemini
- nextjs