Joe Speaking Live: Gemini Speaking Coach

Inspiration

Joe Speaking comes from a personal problem. I grew up learning English in an environment focused on reading and writing, but not real speaking. Even after preparing for IELTS and later moving to Canada, I felt that speaking practice was still missing the most important part: useful feedback that actually changed how I practiced.

That is the idea behind Joe Speaking: not just more speaking practice, but better speaking practice.

For this challenge, I built Joe Speaking Live, a new Gemini Live project based on the strongest part of the Joe Speaking vision: a realistic speaking conversation that feels natural, gives an immediate recap, and makes improvement visible when you retry the same topic.

What it does

Joe Speaking Live is a real-time speaking coach built for the Live Agents category.

A learner starts a live IELTS-style conversation with Gemini, answers naturally, and gets an immediate recap after the exchange. Instead of ending there, the app groups attempts by topic so the learner can practice the same topic again and compare attempts over time.

The core loop is simple:

start a live conversation
answer naturally in voice or text mode
get a recap
retry the same topic
see progress across attempts

That makes the experience feel more like deliberate practice and less like isolated one-off tests.
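As a rough illustration of the retry-and-compare part of the loop, the sketch below groups attempts by topic and reports how the latest attempt compares with the first. Everything in it is an assumption made for illustration: the `Attempt` shape, the numeric `score` field, and the function names `groupByTopic` and `scoreChange` are hypothetical and are not taken from the Joe Speaking codebase.

```typescript
// Hypothetical data model for illustration only; not the app's real schema.
interface Attempt {
  topic: string;
  completedAt: string; // ISO 8601 timestamp
  recap: string;       // recap text shown after the conversation
  score: number;       // assumed numeric score, e.g. an IELTS-style band
}

// Group attempts by topic, oldest first, so retries of one topic sit together.
function groupByTopic(attempts: Attempt[]): Map<string, Attempt[]> {
  const groups = new Map<string, Attempt[]>();
  for (const attempt of attempts) {
    const list = groups.get(attempt.topic) ?? [];
    list.push(attempt);
    groups.set(attempt.topic, list);
  }
  groups.forEach((list) => {
    list.sort((a, b) => Date.parse(a.completedAt) - Date.parse(b.completedAt));
  });
  return groups;
}

// Score change from the first attempt to the latest one (0 if there is no retry yet).
function scoreChange(attemptsForTopic: Attempt[]): number {
  if (attemptsForTopic.length < 2) return 0;
  const first = attemptsForTopic[0];
  const latest = attemptsForTopic[attemptsForTopic.length - 1];
  return latest.score - first.score;
}
```

Keeping attempts sorted inside each topic group means "see progress across attempts" reduces to comparing the first and last entries, or rendering the whole list as a timeline.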

How we built it

Joe Speaking Live is a new standalone challenge build that reuses the Joe Speaking product experience while keeping the runtime focused and within the challenge rules.

The app is built with Next.js and React, uses the Google GenAI SDK with Gemini Live, and is hosted on Google Cloud Run. The backend issues live session credentials and generates the recap. Secrets are stored in Google Secret Manager, and the deploy path runs through Cloud Build and Artifact Registry.
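One way a backend can issue live session credentials is to hand the browser a short-lived, single-use token rather than the long-lived API key. The sketch below only builds the settings for such a token. It is an assumption that the app works this way. The field names `uses`, `expireTime`, and `newSessionExpireTime` follow the Gemini API's ephemeral token settings as I understand them, and the durations and the helper name `buildLiveTokenConfig` are hypothetical.

```typescript
// Hypothetical helper; the real backend may issue credentials differently.
interface LiveTokenConfig {
  uses: number;                 // how many sessions the token may start
  expireTime: string;           // ISO 8601: when the token stops working entirely
  newSessionExpireTime: string; // ISO 8601: deadline for starting a new session
}

function buildLiveTokenConfig(
  now: Date,
  sessionMinutes: number = 30,    // assumed maximum conversation length
  startWindowSeconds: number = 60 // assumed time allowed to open the session
): LiveTokenConfig {
  return {
    uses: 1, // single use, so a leaked token cannot open a second session
    expireTime: new Date(now.getTime() + sessionMinutes * 60 * 1000).toISOString(),
    newSessionExpireTime: new Date(now.getTime() + startWindowSeconds * 1000).toISOString(),
  };
}
```

In a setup like this, a server-side route would pass the config to the SDK's token creation call and return only the resulting token to the client, so the API key held in Secret Manager never reaches the browser.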

On the frontend, I reused the Joe Speaking landing page and in-app visual language directly so the experience still feels like Joe Speaking, not a generic hackathon prototype.

Challenges we ran into

The hardest part was not connecting to Gemini Live. It was preserving the Joe Speaking experience while making the submission simple, stable, and clearly compliant with the challenge.

The existing Joe Speaking product has broader flows and infrastructure, but this challenge entry needed to work as a focused Google Cloud-hosted build. That meant narrowing the product to the strongest user loop while still keeping the Joe Speaking feel.

Another challenge was live-session reliability and demo clarity. The best solution was to reduce scope to one strong loop that judges can understand quickly: live conversation, recap, retry, progression.

Accomplishments that we're proud of

Built a new Gemini Live project that clearly fits the Live Agents category
Preserved the Joe Speaking product identity instead of shipping a generic challenge UI
Delivered a public Google Cloud deployment using Gemini Live and the Google GenAI SDK
Kept the most important learning loop: conversation, recap, retry, improvement
Turned Joe Speaking into a focused, judge-friendly live demo

What we learned

The biggest lesson was that a great challenge submission needs clarity as much as capability. The strongest version of Joe Speaking for this challenge was not the entire product. It was one believable, high-signal loop that demonstrates what Joe Speaking is trying to do.

I also learned that Gemini Live works best when the experience is centered on a natural real-time interaction. Once the conversation loop is strong, the recap and retry flow becomes much more meaningful.