Inspiration

I wanted to solve a pain point of my own: improving my pronunciation of English words.

What it does

It uses Gemini together with ElevenLabs text-to-speech and speech-to-text models to generate vowels and consonants along with their correct pronunciation. A user can then pronounce those words too and get recommendations on how to improve their pronunciation.
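One way to picture the comparison step is a small check between the word the user was asked to say and the transcript that comes back from speech to text. The sketch below is only an illustration under assumptions: the helper names (levenshtein, normalize, pronunciationScore) are hypothetical and not taken from the project, and it compares spelling rather than actual phonemes, so it is a rough proxy at best.

```javascript
// Hypothetical helpers, not from the project: they score how closely a
// speech-to-text transcript matches the word the user was asked to say.

// Classic Levenshtein edit distance between two strings.
function levenshtein(a, b) {
  // prev[j] is the distance between the processed prefix of a and b.slice(0, j).
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Lowercase the text and keep only the letters a to z.
function normalize(text) {
  return text.toLowerCase().replace(/[^a-z]/g, "");
}

// Returns a score between 0 and 1, where 1 means the transcript
// matches the target word exactly after normalization.
function pronunciationScore(targetWord, transcript) {
  const target = normalize(targetWord);
  const heard = normalize(transcript);
  const longest = Math.max(target.length, heard.length);
  if (longest === 0) return 1; // nothing to compare
  return 1 - levenshtein(target, heard) / longest;
}
```

A low score could be one signal to pass along when asking the AI for recommendations, although a transcript that is itself wrong would make the score wrong too.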

How we built it

I built it using Next.js, Gemini, ElevenLabs, and JavaScript.

Challenges we ran into

The ElevenLabs text-to-speech and speech-to-text APIs have some lapses: they sometimes hallucinate.

Accomplishments that we're proud of

The ability to get recommendations from an AI on how to improve the pronunciation of words.

What we learned

Never to use the ElevenLabs APIs for TTS and STT. Maybe next time we can use a paid service on Google Cloud or Azure.

What's next for Speechify

Curated assessments for users, based on a standardized curriculum, using a cloud-based model with a lower error rate in its text-to-speech and speech-to-text APIs.

