Inspiration
I took Spanish in high school but have forgotten most of it, and I am not very comfortable speaking it with native speakers. With a trip to Spain coming up, I found myself wishing that my Spanish were better, but I didn't have an easy way to practice speaking it. A friend suggested that I use voice mode in ChatGPT for this. It worked fairly well, but there were some drawbacks.
First, I wished I could see a transcript of what I said and what I heard, because I can read the language better than I can understand it when spoken. Second, ChatGPT spoke too quickly for me and there was no way to slow its speech down. Lastly, there weren't any hints or other prompts to help me out if I got stuck.
I realized that I could create an app that would fix these issues and more. So I did, and mi español es mucho mejor ahora (my Spanish is much better now).
What it does
Improving your skills in a foreign language requires active listening and speaking practice, ideally with native speakers. Articulate is the first app designed to help you achieve this through a speech-first interface featuring engaging AI tutors who keep the conversation flowing and provide you with helpful corrections when needed. In some ways, AI tutors are even better than human conversation partners: they are always available, and there is no need to be embarrassed if you make a mistake!
How we built it
I started out with an audio-only app, which already had the high-quality speech-to-text and text-to-speech needed for speaking to and hearing from the AI tutor. Then I looked into adding AI-generated avatars, which are mostly video-based, and they made the interactions much more engaging. Unfortunately, the price of AI-generated avatar video would have made the app too expensive to run if people used it regularly, so I used 3D-rendered avatars instead to provide a similar experience.
Challenges we ran into
One challenge was keeping response latency to a minimum. Many serverless back-ends suffer from cold starts and also buffer results, which meant I couldn't stream responses back as they were generated; the client had to wait for everything to complete. I tried out six or seven different options (AWS Lambda, Google Cloud Functions, Azure Functions, Cloudflare Workers, Heroku, and more) and they all had the same issue. I finally settled on AWS App Runner, which is similar to Heroku but didn't overly buffer the responses and provided acceptable latency.
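To illustrate what streaming means here, below is a minimal TypeScript sketch, not the app's actual back-end code. It assumes the tutor's reply arrives as an async stream of text pieces and forwards each piece the moment it is available instead of collecting the whole reply first. The names `ChunkSink`, `fakeTutorReply`, and `streamReply` are hypothetical and exist only for this example.

```typescript
// Minimal output interface (hypothetical). Anything with write/end methods fits.
interface ChunkSink {
  write(chunk: string): unknown;
  end(): unknown;
}

// Hypothetical stand-in for a model that produces its reply piece by piece.
async function* fakeTutorReply(): AsyncGenerator<string> {
  for (const piece of ["Hola, ", "¿cómo ", "estás?"]) {
    yield piece;
  }
}

// Forward every chunk as soon as it arrives rather than buffering the
// full reply. Returns the number of chunks that were forwarded.
async function streamReply(
  source: AsyncIterable<string>,
  sink: ChunkSink
): Promise<number> {
  let count = 0;
  for await (const chunk of source) {
    sink.write(chunk); // the client can start reading or speaking this piece now
    count += 1;
  }
  sink.end();
  return count;
}

// Usage: log each piece as it becomes available.
void streamReply(fakeTutorReply(), {
  write: (chunk) => console.log("chunk:", chunk),
  end: () => console.log("done"),
});
```

In a Node.js server, the response object's own `write` and `end` methods could serve as the sink. The benefit is lost if the hosting platform buffers the output before sending it on, which is the problem described above.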
Accomplishments that we're proud of
I am proud that, working almost entirely on my own, I was able to deliver a rich and complex set of features that add up to a really engaging experience. I entrusted the great Matthew Skiles with creating the app icon, but other than that, I designed and coded the entire app and back-end. It is definitely the best app I have ever made.
What we learned
One thing I learned is that there is a plethora of AI-based services to choose from, and much of the work I did refining the app was trying them out to see which worked best. There are lots of options for high-quality speech-to-text, chatbot responses, and text-to-speech, and I think I've tried just about all of them.
Another thing I learned is that it can be really hard to control chatbot responses. Sometimes they are great and just what you asked for, and other times they go completely off the rails. For instance, if you tell the AI to correct the user in their native language instead of the foreign language, sometimes it works and sometimes it doesn't. And sometimes it will just continue the conversation in the user's native language instead of the language they want to practice. So I learned that it is better to avoid giving the chatbot ways to go off the rails, and to handle things like translating the corrections, if needed, outside of the conversation.
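Here is a rough TypeScript sketch of that separation, under stated assumptions rather than the app's real implementation. The conversation prompt only ever mentions the practice language, and the correction is translated in a second, independent call. The `Complete` type stands in for whatever chat-completion call is used, and the `CORRECTION:` marker is an invented convention for this example.

```typescript
// Hypothetical signature for any chat-completion call:
// system prompt and user text in, reply text out.
type Complete = (systemPrompt: string, userText: string) => Promise<string>;

interface TutorTurn {
  reply: string;             // tutor's reply, in the practice language
  correction: string | null; // correction, translated into the native language
}

const MARKER = "CORRECTION:"; // assumed convention, not a real API feature

// Split the raw model output into the spoken reply and the optional correction.
function splitCorrection(raw: string): { reply: string; correction: string | null } {
  const idx = raw.indexOf(MARKER);
  if (idx === -1) {
    return { reply: raw.trim(), correction: null };
  }
  const rest = raw.slice(idx + MARKER.length).trim();
  return { reply: raw.slice(0, idx).trim(), correction: rest.length > 0 ? rest : null };
}

async function tutorTurn(
  complete: Complete,
  userText: string,
  practiceLang: string,
  nativeLang: string
): Promise<TutorTurn> {
  // Step 1: the conversation never mentions the native language, so the
  // tutor has no cue to switch into it.
  const raw = await complete(
    `You are a friendly ${practiceLang} tutor. Reply only in ${practiceLang}. ` +
      `If the learner made a mistake, add a final line starting with "${MARKER}" ` +
      `that explains it in ${practiceLang}.`,
    userText
  );
  const { reply, correction } = splitCorrection(raw);

  // Step 2: translate the correction in a separate call, outside the conversation.
  const translated =
    correction === null
      ? null
      : await complete(
          `Translate the following text from ${practiceLang} into ${nativeLang}. ` +
            `Output only the translation.`,
          correction
        );
  return { reply, correction: translated };
}
```

Because the translation call carries no conversation history, a bad translation cannot pull the tutor into the learner's native language on later turns.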
What's next for Articulate: AI Language Tutor
First, I want to get feedback and see how people like it. Then I plan to improve it by decreasing response latency even more and adding features like streaks and notifications to keep users engaged.
Built With
- amazon-web-services
- anthropic
- azure
- node.js
- openai
- revenuecat
- swiftui
- tts
- whisper
