Inspiration
I used to watch a YouTuber who created daily Spanish stories and read them aloud. His mantra: the only way to become fluent in a language is to first be able to understand what you're hearing.
What it does
auri sends users one short daily lesson (designed to be completed in under 10 minutes) via email.
Each lesson includes:
- An audio dictation that users write down by hand, replaying it as many times as they need.
- A set of spoken comprehension questions, answered verbally.
- Gentle, real-time AI feedback on pronunciation and clarity.
- A full transcript at the end for review and reflection.
Lessons adapt subtly based on the user’s level and self-reported difficulty, focusing on realistic, everyday listening scenarios rather than drills or gamified exercises.
How we built it
auri is built around a simple daily loop triggered by email. Each lesson lives on the web and focuses entirely on audio interaction.
Under the hood, we designed a CEFR-based (Common European Framework of Reference for Languages) generation framework that guides the AI using explicit constraints: story length, speech pacing, inference level, question complexity, and discourse structure. This ensures lessons feel consistent and appropriate across levels.
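A CEFR-keyed constraint table like the one below could drive this kind of generation framework. This is a minimal sketch, not auri's actual schema: the level boundaries, field names (`wordCount`, `speechWpm`, `inference`, `questions`), and prompt wording are all illustrative assumptions.

```typescript
// Hypothetical CEFR constraint table used to steer story generation.
type CEFRLevel = "A1" | "A2" | "B1" | "B2" | "C1";

interface LessonConstraints {
  wordCount: [number, number]; // min/max story length in words
  speechWpm: number;           // target narration pace (words per minute)
  inference: "literal" | "bridging" | "global"; // how much the listener must infer
  questions: number;           // spoken comprehension questions per lesson
}

const CONSTRAINTS: Record<CEFRLevel, LessonConstraints> = {
  A1: { wordCount: [60, 100],  speechWpm: 90,  inference: "literal",  questions: 3 },
  A2: { wordCount: [100, 160], speechWpm: 105, inference: "literal",  questions: 3 },
  B1: { wordCount: [160, 240], speechWpm: 120, inference: "bridging", questions: 4 },
  B2: { wordCount: [240, 320], speechWpm: 135, inference: "bridging", questions: 4 },
  C1: { wordCount: [320, 450], speechWpm: 150, inference: "global",   questions: 5 },
};

// Turn the constraints into explicit instructions for the model prompt.
function constraintPrompt(level: CEFRLevel): string {
  const c = CONSTRAINTS[level];
  return [
    `Write a ${level} everyday listening story of ${c.wordCount[0]}-${c.wordCount[1]} words.`,
    `Target a narration pace of about ${c.speechWpm} words per minute.`,
    `Comprehension should require ${c.inference} understanding.`,
    `End with ${c.questions} spoken comprehension questions.`,
  ].join(" ");
}
```

Keeping the constraints in data rather than baked into a prompt string makes it easy to nudge a user between levels based on self-reported difficulty.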
The app was first designed and built in Google AI Studio, then transitioned to Antigravity to better manage the more complex workflows. Flash was the main model, with Gemini Pro 3 coming in clutch for the more difficult pieces.
Gemini also generates the lessons and acts as the LLM behind the speech-to-text and text-to-speech operations, via ElevenLabs agents.
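The daily loop described above could be orchestrated roughly as follows. The client interfaces here are stubs standing in for Gemini, ElevenLabs, Google Cloud Storage, and the mailer; `runDailyLesson` and all of its parameter names are hypothetical, not auri's real API.

```typescript
// Sketch of the daily lesson loop: generate text, synthesize audio, store, email.
// Each client is an assumed stand-in for the real SDK call.
interface LessonClients {
  generateStory: (prompt: string) => Promise<string>;          // e.g. Gemini
  synthesize: (text: string) => Promise<Uint8Array>;           // e.g. ElevenLabs TTS
  store: (name: string, audio: Uint8Array) => Promise<string>; // e.g. GCS, returns a URL
  email: (to: string, lessonUrl: string) => Promise<void>;     // daily trigger
}

async function runDailyLesson(
  user: { email: string; level: string },
  clients: LessonClients
): Promise<string> {
  const story = await clients.generateStory(`Write a short ${user.level} listening story.`);
  const audio = await clients.synthesize(story);
  const url = await clients.store(`lessons/${user.email}/${Date.now()}.mp3`, audio);
  await clients.email(user.email, url);
  return url; // the web lesson the email links to
}
```

Injecting the clients as an interface keeps the loop testable without hitting any of the real services.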
Challenges we ran into
- Figuring out how all the pieces fit together was fun and challenging: story creation -> text-to-speech -> dialogue with the AI, and so on
- Figuring out permissions for Google Cloud Storage
Accomplishments that we’re proud of
- Using both Gemini and ElevenLabs in the app was super cool
- Wrangling with Google Cloud's APIs to figure out how to store files (love you Google, but ugh, the docs are not super straightforward)
- Keeping things pretty minimal (there are so many cool ideas to build with this)
What we learned
- I learned how to pipe audio streams from one service to another (ElevenLabs to Google)
- I learned more about CEFR and was able to dig into the nuances of language fluency
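The stream-piping lesson above can be captured with Node's built-in `stream.pipeline`. In the real app the source would be the ElevenLabs TTS response stream and the sink would be something like `bucket.file(name).createWriteStream()` from `@google-cloud/storage`; this sketch uses plain Node streams so it stays self-contained.

```typescript
import { Readable, Writable } from "stream";
import { pipeline } from "stream/promises";

// Pipe an audio stream from one service into another without buffering the
// whole file in memory. pipeline() handles backpressure and propagates errors
// from either end, so a failed upload rejects the returned promise.
async function pipeAudio(source: Readable, sink: Writable): Promise<void> {
  await pipeline(source, sink);
}
```

Streaming end to end avoids holding multi-megabyte MP3s in memory while a lesson is being synthesized and uploaded.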
What’s next for auri
I want to further personalize each user's learning journey. Features like highlighting the words in the transcript that the user missed, then automatically turning them into Anki cards, would be amazing.
Built With
- elevenlabs
- gemini
- google-cloud
- tanstack