Inspiration
Many English learners struggle to practice real speaking because they don’t have a partner or teacher available all the time. I wanted to build something that feels like talking to a real person, but is always available and gives instant professional feedback. This idea led to LinguaBot AI: a voice-to-voice English speaking coach powered by Gemini.
What it does
LinguaBot lets users speak naturally through their microphone and have a real-time spoken conversation with an AI. After the conversation, the user can generate a performance report that estimates an IELTS-style band score and provides detailed feedback, grammar corrections, and improvement tips.
How I built it
The app is built with Next.js and Tailwind CSS for a fast and clean web interface. The browser’s Web Speech API (SpeechRecognition) converts the user’s voice to text. The transcript and the conversation history are then sent to the Gemini API, which generates short, natural replies in a spoken-English style.
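As a rough illustration of the step above, here is a minimal sketch of how the transcript and history could be packed into a request body. It is not the project's actual code: the `Turn` type, the `COACH_PROMPT` text, the `buildCoachRequest` helper, and the `maxTurns` limit are all assumptions made for this example. The payload shape follows the `contents` / `role` / `parts` layout used by Gemini's `generateContent` REST endpoint; exact field names should be checked against the current API reference.

```typescript
// Hypothetical types: the project's real data model is not shown in this write-up.
type Role = "user" | "model";

interface Turn {
  role: Role;
  text: string;
}

interface CoachRequest {
  systemInstruction: { parts: { text: string }[] };
  contents: { role: Role; parts: { text: string }[] }[];
}

// Assumed coach prompt, written only for illustration.
const COACH_PROMPT =
  "You are a friendly English speaking partner. " +
  "Reply in one or two short, spoken-style sentences.";

// Build the request body from earlier turns plus the newest transcript.
// Only the most recent `maxTurns` turns are kept so the request stays small.
function buildCoachRequest(
  history: Turn[],
  transcript: string,
  maxTurns = 10
): CoachRequest {
  const recent = maxTurns > 0 ? history.slice(-maxTurns) : [];
  const turns: Turn[] = [...recent, { role: "user", text: transcript.trim() }];
  return {
    systemInstruction: { parts: [{ text: COACH_PROMPT }] },
    contents: turns.map((t) => ({ role: t.role, parts: [{ text: t.text }] })),
  };
}
```

The returned object would be sent as the JSON body of the API call. Trimming the history is one simple way to keep latency low in a voice conversation, at the cost of the model forgetting older turns.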
For evaluation, the latest spoken answer is sent to Gemini with an IELTS examiner prompt. Gemini analyzes fluency, grammar, vocabulary, and coherence, and returns a structured report with a band score and suggestions. The AI replies are read aloud using the browser’s text-to-speech so the experience is fully voice-to-voice.
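Because the examiner reply comes back as text, it helps to validate it before showing a score. The sketch below is one possible way to do that, assuming the examiner prompt asks Gemini for a JSON object; the `SpeakingReport` field names and the `parseReport` helper are hypothetical and not taken from the project. The band is clamped to the IELTS range of 0 to 9 and rounded to the nearest half band.

```typescript
// Hypothetical report shape: the real prompt and schema are not shown here.
interface SpeakingReport {
  band: number;
  fluency: string;
  grammar: string;
  vocabulary: string;
  coherence: string;
  suggestions: string[];
}

// Parse the model's text reply into a report, or return null if it is unusable.
function parseReport(raw: string): SpeakingReport | null {
  // Models sometimes wrap JSON in extra text, so take the outermost {...} span.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return null;

  let data: unknown;
  try {
    data = JSON.parse(raw.slice(start, end + 1));
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const d = data as Record<string, unknown>;

  // Reject scores outside the 0-9 band range.
  const band = d.band;
  if (typeof band !== "number" || !Number.isFinite(band) || band < 0 || band > 9) {
    return null;
  }

  // Every criterion must come back as a string comment.
  const criteria = ["fluency", "grammar", "vocabulary", "coherence"] as const;
  for (const key of criteria) {
    if (typeof d[key] !== "string") return null;
  }

  const suggestions = Array.isArray(d.suggestions)
    ? d.suggestions.filter((s): s is string => typeof s === "string")
    : [];

  return {
    band: Math.round(band * 2) / 2, // round to the nearest half band
    fluency: d.fluency as string,
    grammar: d.grammar as string,
    vocabulary: d.vocabulary as string,
    coherence: d.coherence as string,
    suggestions,
  };
}
```

Returning `null` instead of throwing lets the UI fall back to a "could not generate report, try again" message when the model's output does not match the expected shape.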
Challenges I faced
The main challenge was keeping replies short and natural, like real speech rather than long written paragraphs. It took careful prompt design to make Gemini respond conversationally in one role and act as an examiner for scoring in another. Handling microphone input and synchronizing the listening, thinking, and speaking states in the browser was also tricky.
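One way to keep listening, thinking, and speaking in sync is a small state machine that ignores events arriving in the wrong phase. The sketch below is an assumption about how this could be structured, not the project's actual implementation; the `Phase` and `VoiceEvent` names and the `nextPhase` helper are invented for this example. In a real app the events would be fired from the speech recognition, Gemini response, and speech synthesis callbacks.

```typescript
// Hypothetical phases of one voice turn.
type Phase = "idle" | "listening" | "thinking" | "speaking";

// Hypothetical events raised by the mic, the API call, and text-to-speech.
type VoiceEvent = "MIC_START" | "SPEECH_END" | "REPLY_READY" | "TTS_END" | "CANCEL";

// Allowed transitions. Anything not listed leaves the phase unchanged,
// so a stray MIC_START while the bot is speaking is simply ignored.
const transitions: Record<Phase, Partial<Record<VoiceEvent, Phase>>> = {
  idle: { MIC_START: "listening" },
  listening: { SPEECH_END: "thinking", CANCEL: "idle" },
  thinking: { REPLY_READY: "speaking", CANCEL: "idle" },
  speaking: { TTS_END: "idle", CANCEL: "idle" },
};

// Pure transition function: easy to test and to plug into a React reducer.
function nextPhase(phase: Phase, event: VoiceEvent): Phase {
  return transitions[phase][event] ?? phase;
}
```

Because `nextPhase` has the `(state, action) => state` shape, it could be passed to React's `useReducer`, with the UI deriving the mic button state and status text from the current phase. Blocking `MIC_START` during `speaking` is a deliberate choice here so the recognizer does not transcribe the bot's own voice.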
What I learned
I learned how to design effective prompts for different roles of the same model (coach vs examiner), how to build real-time voice interactions in the browser, and how powerful Gemini is as the core intelligence for both conversation and language assessment.
LinguaBot shows how Gemini can remove the need for a human speaking partner by providing instant, interactive practice and professional-level feedback.
Built With
- browser-speechsynthesis-api
- google-ai-studio
- google-gemini-api
- next.js
- react
- tailwind-css
- typescript
- web-speech-api-(speechrecognition)