Inspiration
One year ago, I moved from China to Canada, and this journey completely changed my perspective on language learning. Despite years of studying English in school, I found myself struggling to communicate fluently in my daily work life. The gap between textbook English and real-world professional communication was overwhelming.
I tried countless language learning apps and methods (flashcard apps, grammar courses, conversation partners, and podcast listening), but none of them addressed my actual needs. Most apps focused on beginner vocabulary or tourist phrases, not the nuanced, professional English I needed to succeed at work. I couldn't find a tool that helped me understand my Canadian colleagues in meetings, practice my pronunciation with real feedback, or learn from the content I actually encountered in my professional life.
Out of frustration and necessity, I decided to build my own solution. AI Lingua Flow was born from this personal struggle: a tool designed by someone who truly understands the challenges faced by Chinese professionals adapting to an English-speaking work environment. By leveraging the power of Google Gemini AI, I created an app that provides the personalized, context-aware learning experience I wished I had when I first arrived in Canada.
What it does
AI Lingua Flow is a comprehensive Chinese-to-English learning app that leverages Google Gemini AI to provide an immersive learning experience:
- Text & Image Study: Learn from any text or image content with AI-assisted comprehension
- Video Study: Import YouTube videos and learn from real-world English content with AI-generated analysis
- Shadowing Practice: Practice pronunciation with AI-powered speech recognition and real-time feedback
- Smart Vocabulary: Build your personal vocabulary with spaced repetition and dictation exercises
- Assessment: Track your progress with AI-driven skill evaluations and personalized recommendations
How we built it
We built AI Lingua Flow using Flutter 3.x with Dart SDK ^3.10.7, enabling true cross-platform support for Android, iOS, macOS, Windows, Linux, and Web from a single codebase.
Core Architecture:
- State Management: Riverpod 2.x for reactive, testable state handling
- Dependency Injection: GetIt + Injectable for clean, modular code organization
- Navigation: GoRouter for declarative, type-safe routing
AI Integration:
- Google Gemini API (via Dio REST client) powers our intelligent features: content analysis, pronunciation feedback, and high-quality text-to-speech
Data & Storage:
- SQLite (sqflite) for local vocabulary and learning progress
- Flutter Secure Storage for safely storing API keys
Media Capabilities:
- youtube_explode_dart for YouTube video content extraction
- video_player + chewie for smooth video playback
- speech_to_text for real-time pronunciation recognition
- flutter_tts + custom Gemini TTS for natural voice synthesis
- just_audio + record for audio playback and recording
Development Workflow:
- Run `flutter pub get` to install dependencies
- Generate code with `flutter pub run build_runner build --delete-conflicting-outputs`
- Run on any platform with `flutter run`
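To make the "Gemini API via a REST client" point above more concrete, here is a rough, language-agnostic sketch in Python (not the app's Dart/Dio code) of how a content-analysis request could be assembled. The model name, the prompt wording, and the `build_analysis_request` helper are assumptions for illustration; only the general `generateContent` request shape comes from the public Gemini REST API.

```python
import json
import urllib.request

# Assumed values for illustration; the write-up does not name a model or API version.
GEMINI_MODEL = "gemini-1.5-flash"
BASE_URL = "https://generativelanguage.googleapis.com/v1beta/models"


def build_analysis_request(api_key: str, passage: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for one passage."""
    # Hypothetical prompt: the real prompts used by the app are not published.
    prompt = (
        "You are helping a Chinese-speaking professional learn workplace English.\n"
        "Explain the key vocabulary and idioms in the passage below, "
        "with a short Chinese gloss for each.\n\n"
        f"Passage:\n{passage}"
    )
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{BASE_URL}/{GEMINI_MODEL}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # key is passed in, never hard-coded
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the payload easy to inspect and unit-test without a network call or a real key.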
Challenges we ran into
- Text-to-Speech Quality: Early in development, we used an older TTS API version that produced robotic, unnatural-sounding speech. For a language learning app, pronunciation quality is everything: learners need to hear authentic, natural English to develop proper listening skills and speaking habits. The mechanical voice made us seriously question whether we could meet the standards required for an educational app. After extensive research, we integrated Gemini AI's advanced TTS capabilities, which finally delivered the natural, human-like voice quality essential for effective language learning.
- YouTube Content Copyright Considerations: Extracting content from YouTube videos presents inherent copyright risks. While our app processes videos for educational purposes, the legal implications of content extraction remain a gray area. For this demo version, we focused on building the technical functionality, but we acknowledge that a production release would require careful consideration of content licensing, fair use policies, and potentially partnerships with content creators or platforms to ensure full legal compliance.
- Offline-First Architecture: This was our biggest engineering investment, crucial for two reasons:
  - Token consumption: Caching AI responses locally dramatically reduces API costs.
  - User experience: Cached content loads instantly, with no waiting for real-time generation. And when network issues occur, the app remains fully functional instead of becoming unusable.
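The cache-first idea behind the offline-first point can be sketched roughly as follows. This is an illustrative Python sketch, not the app's actual Dart/sqflite code; the `ai_cache` table, the `ResponseCache` class, and the `generate` callback are hypothetical names chosen for the example.

```python
import hashlib
import sqlite3
from typing import Callable


class ResponseCache:
    """Cache-first lookup: reuse a stored AI response, else generate and store it."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS ai_cache ("
            "key TEXT PRIMARY KEY, response TEXT NOT NULL)"
        )

    @staticmethod
    def _key(feature: str, prompt: str) -> str:
        # Hash feature + prompt so identical requests map to the same row.
        return hashlib.sha256(f"{feature}\n{prompt}".encode("utf-8")).hexdigest()

    def get_or_generate(
        self, feature: str, prompt: str, generate: Callable[[str], str]
    ) -> str:
        key = self._key(feature, prompt)
        row = self.db.execute(
            "SELECT response FROM ai_cache WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return row[0]  # cache hit: no network call, no tokens spent
        response = generate(prompt)  # cache miss: call the model (may fail offline)
        self.db.execute(
            "INSERT OR REPLACE INTO ai_cache (key, response) VALUES (?, ?)",
            (key, response),
        )
        self.db.commit()
        return response
```

With this shape, a second request for the same content never reaches the network, which is where both the token savings and the instant loading come from. Content that has never been generated still needs a connection the first time.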
Accomplishments that we're proud of
- Seamless AI integration: Our Gemini AI integration provides natural, context-aware responses that genuinely help learners improve.
- Beautiful, intuitive UI: We crafted a premium user experience with smooth animations and modern design.
- Comprehensive learning ecosystem: We built not just one feature, but an integrated learning platform with 10+ interconnected modules.
- True cross-platform support: The app runs natively on all major platforms from a single codebase.
- Smart spaced repetition: Our vocabulary system uses proven learning science to maximize retention.
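The write-up does not say which spaced repetition algorithm the vocabulary system uses. As one simple possibility, a Leitner-style scheduler can be sketched in Python as below; the interval values and the `Card`, `review`, and `due_cards` names are assumptions for illustration, not the app's implementation.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

# Hypothetical review intervals (in days) for each Leitner box.
INTERVALS_DAYS = [1, 2, 4, 8, 16]


@dataclass
class Card:
    word: str
    box: int = 0          # index into INTERVALS_DAYS
    due: date = date.min  # new cards are due immediately


def review(card: Card, correct: bool, today: date) -> Card:
    """Promote the card one box on a correct answer; send it back to box 0 on a miss."""
    if correct:
        box = min(card.box + 1, len(INTERVALS_DAYS) - 1)
    else:
        box = 0
    return Card(card.word, box, today + timedelta(days=INTERVALS_DAYS[box]))


def due_cards(cards: List[Card], today: date) -> List[Card]:
    """Return the cards whose next review date has arrived."""
    return [c for c in cards if c.due <= today]
```

Words answered correctly are shown at growing intervals, while missed words return to daily review, so study time concentrates on the vocabulary that is not yet retained.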
What we learned
- AI prompt engineering is crucial: The quality of Gemini AI responses depends heavily on how we structure our prompts.
- User experience matters in learning apps: Small friction points can derail a learner's motivation; every interaction must feel smooth.
- Flutter's power and limitations: While Flutter enabled rapid cross-platform development, platform-specific audio features required native code bridges.
- Language learning is deeply personal: Users have vastly different learning styles, and flexibility in the app design is essential.
What's next for AI Lingua Flow
- Conversation Mode: Add AI-powered conversation practice where users can have natural dialogues with Gemini AI
- More Language Pairs: Expand beyond Chinese-English to support other language combinations
- Advanced Analytics: Provide deeper insights into learning patterns and personalized study plans
- Podcast & Audiobook Integration: Support more audio content sources for immersive listening practice
- Social Features: Enable learners to practice together, share vocabulary lists, and compete on leaderboards
- Offline Gemini: Explore on-device AI models for offline pronunciation assessment