Inspiration

I love learning languages, and I've made a habit of trying to learn endangered ones. I get frustrated that a lot of the apps on the market end up feeling like flashcards: I can't easily practice speaking, and I don't get the chance to focus on what I actually want to talk about, namely my own life and interests. I wanted to see if I could build something with Gemini that lets me do that.

What it does

Jerome is a conversational language tutor built around Gemini's Live Voice API (gemini-2.5-flash-native-audio-preview). Live Voice is the core of the experience: instead of rigid turn-taking, learners hold a fluid, real-time conversation with an AI tutor — and because Gemini processes audio natively rather than routing through text-to-speech, it handles the phonetics of even lower-resource languages like Scots Gaelic, Welsh, and Te Reo Māori with surprising fidelity. For intermediate learners — who know grammar but freeze in real conversation — this is transformative. They can drift naturally between their target language and English mid-sentence to ask "how do I say...?", get an instant answer, and keep speaking, exactly as they would with a human tutor abroad.

Beyond Live Voice, Jerome uses Gemini 3 Flash for structured JSON extraction. After each session, the model analyzes the full conversation transcript to identify new vocabulary and, critically, to detect auto-mastery: it recognizes when a learner has correctly used a previously studied word in spontaneous speech, so their spaced-repetition record is updated without manual self-grading. Gemini 2.5 Flash TTS provides audio playback in text-chat mode, and Google Search grounding supplies fact-checked cultural context.
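
To make the auto-mastery step concrete, here is a minimal Python sketch of how an extracted session analysis could be folded into a spaced-repetition deck. It is not Jerome's actual code: the JSON field names (`new_vocabulary`, `auto_mastered`), the `Card` class, the `apply_session_analysis` function, and the interval-doubling rule are all assumptions made for illustration, and the call to Gemini that would produce the JSON is left out.

```python
import json
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Card:
    """One vocabulary item in the learner's deck (hypothetical structure)."""
    word: str
    interval_days: int = 1   # gap until the next review
    due: date = date.min     # date of the next scheduled review


def apply_session_analysis(deck: dict[str, Card], analysis_json: str, today: date) -> dict[str, Card]:
    """Merge one session's extracted analysis into the deck (mutates and returns it)."""
    analysis = json.loads(analysis_json)

    # Auto-mastery first, so only words studied before this session qualify.
    # Assumed rule: a correct spontaneous use doubles the review interval.
    for word in analysis.get("auto_mastered", []):
        card = deck.get(word)
        if card is None:
            continue  # the word was never studied, so there is nothing to update
        card.interval_days *= 2
        card.due = today + timedelta(days=card.interval_days)

    # Words seen for the first time enter the deck with a one-day interval.
    for word in analysis.get("new_vocabulary", []):
        if word not in deck:
            deck[word] = Card(word=word, interval_days=1, due=today + timedelta(days=1))

    return deck


# Example: one studied word used correctly, one new word encountered.
deck = {"taing": Card("taing", interval_days=2, due=date(2025, 1, 3))}
payload = '{"new_vocabulary": ["madainn"], "auto_mastered": ["taing"]}'
apply_session_analysis(deck, payload, today=date(2025, 1, 3))
```

Keeping the model's output to two flat lists of words means the scheduling logic lives in ordinary code, where it is easy to test and change independently of the prompt.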

How we built it

This was built almost entirely via prompting in AI Studio. From time to time I had to patch up race conditions that consecutive prompts had re-introduced.

Challenges we ran into

The UI/UX took a lot of back-and-forth and ideation to make it seamless and easy to use. I wanted a strong focus on speech, and it was hard to get that to feel right.

What's next for Jerome

This is the first thing I've built for a hackathon that I actually use day-to-day! I'm going to keep refining it, and I hope I'll get the chance to share it with others and see if they find it useful, too.
