FluentEcho

Self-paced learning process with five predefined levels.
Gemini TTS-powered practice phrase generation.
Personalized voice profiling with a confidence score.
Granular pronunciation analysis that provides actionable feedback.
Additional targeted practice powered by Gemini 3.

Inspiration

Inspired by the YouTube video "This is a Breakthrough....".

What it does

First, the user records an audio clip of themselves speaking the practice phrase. Gemini then analyzes the recording, and the app shows the pronunciation analysis results along with an additional practice phrase and pronunciation guidance. The app's main advantages are its cost efficiency and relatively high accuracy.

How we built it

The React SPA was vibe-coded in AI Studio. The synthetic benchmark pipeline (Python) was developed in VS Code. The core functionality is built on the Gemini 3 family's native audio understanding.
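
As a rough illustration of what a native-audio request can look like, here is a minimal Python sketch using the google-genai SDK. It is not the app's actual code: the prompt wording, the function names build_prompt and analyze_recording, and the WAV input format are all assumptions, and the model ID is left to the caller rather than guessed.

```python
def build_prompt(phrase: str) -> str:
    """Build the instruction sent alongside the learner's recording.

    The wording here is hypothetical, not the prompt FluentEcho uses.
    """
    return (
        "The speaker is attempting to read this practice phrase aloud:\n"
        f'"{phrase}"\n'
        "List only the words that are clearly mispronounced and, for each, "
        "give one short correction tip. If every word is acceptable, say so."
    )


def analyze_recording(audio_path: str, phrase: str, model: str) -> str:
    """Send one recording plus the prompt to Gemini and return its text reply.

    Requires the google-genai package and an API key in the environment.
    """
    # Imported lazily so build_prompt stays usable without the SDK installed.
    from google import genai
    from google.genai import types

    with open(audio_path, "rb") as f:
        audio_bytes = f.read()

    client = genai.Client()  # picks up the API key from the environment
    response = client.models.generate_content(
        model=model,  # caller supplies a Gemini 3 model ID
        contents=[
            build_prompt(phrase),
            # Assumes the clip is a WAV file; adjust mime_type otherwise.
            types.Part.from_bytes(data=audio_bytes, mime_type="audio/wav"),
        ],
    )
    return response.text or ""
```

Passing the raw audio bytes inline keeps the sketch to a single call; for longer recordings, uploading the file first would likely be a better fit.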

Challenges we ran into

Using synthetic data to run sanity checks and confirm that hallucination is minimal.
Reducing false positives in the pronunciation analysis results, which proved quite hard to tackle.
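
One way to put a number on the false-positive problem with synthetic data is sketched below. It assumes that a false positive means a correctly pronounced word flagged as an error, and that each synthetic clip records which word positions were deliberately mispronounced. The Sample type and its field names are hypothetical, and the two samples are made-up illustrations, not benchmark results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One synthetic clip, described by word positions (0-based)."""
    n_words: int               # number of words in the practice phrase
    injected: frozenset[int]   # positions deliberately mispronounced (ground truth)
    flagged: frozenset[int]    # positions the model reported as mispronounced


def false_positive_rate(samples: list[Sample]) -> float:
    """Share of correctly pronounced words that the model flagged anyway."""
    # Flagged positions that were never mispronounced are false positives.
    false_pos = sum(len(s.flagged - s.injected) for s in samples)
    # Only correctly pronounced words can become false positives.
    correct = sum(s.n_words - len(s.injected) for s in samples)
    return false_pos / correct if correct else 0.0


# Illustrative data only.
samples = [
    Sample(n_words=5, injected=frozenset({1}), flagged=frozenset({1, 3})),
    Sample(n_words=4, injected=frozenset(), flagged=frozenset()),
]
print(false_positive_rate(samples))  # 0.125 (1 false flag over 8 correct words)
```

Tracking word positions instead of the words themselves avoids ambiguity when a phrase repeats a word.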

Accomplishments that we're proud of

Using synthetic data to test Gemini 3's native audio understanding capability.

What we learned

How to process audio data with Gemini 3, and how to mitigate false positives in its pronunciation analysis.

What's next for FluentEcho

Gamifying the levels and generating more nuanced practice.
Gathering data (both synthetic and organic) to rigorously benchmark multimodal LLMs' audio understanding.
Getting rid of all the phrases once the false-positive problem is solved.
Speech training that utilizes Gemini's visual understanding ability.

Built With

Updates

G. Shawn started this project — Feb 09, 2026 12:05 PM EST
