Inspiration
The inspiration behind LangAI comes from the desire to create a more immersive and adaptive language learning experience. A year and a half ago, like many others, I started learning Russian. Spending time around my Russian friends, I found their culture fascinating. I started with the basics, downloading Duolingo and working through its Russian modules. Although I learned a good amount at first, Duolingo and similar platforms can't provide the level of personalization and interactivity that a tutor can. With LangAI, we wanted to improve on that. We wanted to develop a solution that could track a user's progress, offer personalized lessons, and provide instant feedback, mimicking the experience of learning with a native speaker. The goal was to create an AI-powered tutor that evolves alongside the learner, improving not only language skills but also confidence in speaking.
What it does
LangAI is a real-time language tutor that adapts to each learner’s unique needs. It tracks a user's strengths and weaknesses, providing personalized lessons based on their progress. The application features interactive conversation practice, real-time pronunciation feedback, and structured lessons. Users can engage in free conversations or follow a predefined learning path. The AI helps learners improve their speaking, listening, and understanding skills in an interactive and engaging way.
How we built it
We built LangAI using a combination of modern technologies:
- Frontend: The web application is powered by Next.js, providing a seamless and responsive user interface.
- Backend: The backend is built using Flask and Python for handling user requests and generating session tokens. It also integrates with LiveKit to facilitate real-time voice communication between the user and the AI.
- AI Integration: We use OpenAI's GPT-4o model to process and understand user speech, offer corrections, and generate appropriate responses.
- Database: We use Neon PostgreSQL to store user data, including progress, strengths, and weaknesses.
- Voice Agent: The voice agent, which listens to the user's speech and provides feedback, is powered by custom Python scripts and uses real-time processing.
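The session-token step in the backend can be illustrated with the standard library alone. This is a minimal sketch, not the project's actual code: the function names, the 10-minute lifetime, and the claim layout (modeled on a LiveKit-style `video` grant) are all assumptions, and a real deployment would more likely rely on LiveKit's server SDK to build the token.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, the encoding JWTs use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_session_token(api_key, api_secret, identity, room,
                       ttl_seconds=600, now=None):
    """Build an HS256-signed JWT that lets `identity` join `room`."""
    issued_at = int(time.time()) if now is None else int(now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,
        "sub": identity,
        "nbf": issued_at,
        "exp": issued_at + ttl_seconds,
        # Assumed grant layout; check the LiveKit docs before relying on it.
        "video": {"room": room, "roomJoin": True},
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(api_secret.encode("utf-8"),
                         signing_input.encode("utf-8"),
                         hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_session_token(token, api_secret, now=None):
    """Return the claims if the signature and expiry check out, else None."""
    try:
        header_b64, claims_b64, sig_b64 = token.split(".")
    except ValueError:
        return None  # not three dot-separated segments
    signing_input = f"{header_b64}.{claims_b64}".encode("utf-8")
    expected = _b64url(hmac.new(api_secret.encode("utf-8"),
                                signing_input, hashlib.sha256).digest())
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(expected.encode("utf-8"),
                               sig_b64.encode("utf-8")):
        return None
    padded = claims_b64 + "=" * (-len(claims_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    current = int(time.time()) if now is None else int(now)
    if current >= claims["exp"]:
        return None  # token has expired
    return claims
```

Keeping the lifetime short and verifying the signature on every server that receives the token means the frontend never needs to see the API secret.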
Challenges we ran into
- Real-time voice processing: Ensuring smooth communication between the user and the AI was one of the biggest challenges. Integrating real-time voice recognition and response generation while maintaining low latency required fine-tuning both the backend and frontend components.
- Personalization of lessons: Developing a system that tracks the user's weaknesses and strengths in real time, while dynamically adjusting lesson difficulty, was a complex task. We needed a robust algorithm to analyze speech and interaction patterns.
- Token management: Handling session tokens securely across multiple servers was a challenge, as we needed to ensure smooth transitions between the Flask-based token generator and the LiveKit integration.
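To make the personalization challenge above more concrete, here is one simple way strengths and weaknesses could be tracked. It is a hypothetical sketch rather than the algorithm LangAI uses: the `LearnerProfile` class, the exponential moving average, the neutral starting score of 0.5, and the 0.4 / 0.8 thresholds are all illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class LearnerProfile:
    """Running per-skill accuracy estimates for one learner (hypothetical)."""

    smoothing: float = 0.3  # weight given to the newest result
    scores: dict = field(default_factory=dict)

    def record(self, skill: str, correct: bool) -> float:
        """Blend the newest result into the skill's running score (0.0 to 1.0)."""
        observed = 1.0 if correct else 0.0
        previous = self.scores.get(skill, 0.5)  # unseen skills start neutral
        updated = (1 - self.smoothing) * previous + self.smoothing * observed
        self.scores[skill] = updated
        return updated

    def weakest(self, count: int = 1) -> list:
        """Return the `count` lowest-scoring skills, weakest first."""
        return sorted(self.scores, key=self.scores.get)[:count]

    def next_difficulty(self, skill: str) -> str:
        """Suggest whether the next exercise should get easier or harder."""
        score = self.scores.get(skill, 0.5)
        if score < 0.4:
            return "easier"
        if score > 0.8:
            return "harder"
        return "same"


profile = LearnerProfile()
profile.record("genitive case", correct=False)
profile.record("verbs of motion", correct=True)
print(profile.weakest())  # the skill to focus the next lesson on
```

A moving average weights recent answers more heavily than old ones, so the profile can follow a learner who is improving without being dragged down by early mistakes.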
Accomplishments that we're proud of
- Real-time AI interaction: We're proud of successfully building an AI that can converse in real time with the user, providing valuable feedback on pronunciation, grammar, and sentence structure.
- Personalized learning path: The ability to track individual user progress and adapt lessons to their unique needs is a significant accomplishment.
- Cross-server integration: Integrating multiple servers (Next.js frontend, Python backend with LiveKit, and Flask server) into a seamless system was a challenging yet rewarding process.
What we learned
- Real-time AI processing: We learned a lot about optimizing real-time AI interactions, including managing API calls, processing voice inputs, and ensuring timely responses.
- Session management and token security: Implementing secure token generation and managing user sessions across different servers gave us deeper insights into session management best practices.
- Voice recognition and feedback: Working with voice recognition systems taught us how to refine the process of converting speech to text, analyzing it for accuracy, and providing real-time corrections.
- Authentication: We implemented Google Identity Services (GSI) to handle authentication. This system allows users to sign in securely via their Google account, making the login process seamless and secure.
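The "analyzing it for accuracy" step in the voice recognition point can be approximated with a word-level comparison between the sentence the learner was asked to say and the transcript returned by speech-to-text. The sketch below is an illustration under assumptions, not the project's pipeline: `compare_utterance` is a hypothetical helper, and it uses Python's `difflib` instead of the GPT-4o based analysis described earlier.

```python
import difflib

_PUNCTUATION = ".,!?;:"


def _words(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    stripped = (w.strip(_PUNCTUATION).lower() for w in text.split())
    return [w for w in stripped if w]


def compare_utterance(expected, transcript):
    """Score a transcript against the target sentence and list the mismatches."""
    target = _words(expected)
    spoken = _words(transcript)
    matcher = difflib.SequenceMatcher(a=target, b=spoken, autojunk=False)
    corrections = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # tag is "replace", "delete" (word missed) or "insert" (extra word)
        corrections.append({
            "kind": tag,
            "expected": target[i1:i2],
            "heard": spoken[j1:j2],
        })
    return {"score": round(matcher.ratio(), 2), "corrections": corrections}


# Example: the learner used the wrong case ending on the last word.
feedback = compare_utterance("Я читаю книгу.", "я читаю книга")
for item in feedback["corrections"]:
    print(item["kind"], item["expected"], item["heard"])
```

A comparison like this only catches wording differences in the transcript; judging pronunciation itself would need access to the audio or to phoneme-level output from the recognizer.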
What's next for LangAI
- Expanded language support: In the future, we aim to expand LangAI’s language capabilities to include more languages, dialects, and regional accents, making it accessible to a wider audience.
- Advanced speech analysis: We plan to enhance the pronunciation feedback system, adding features like pitch analysis and fluency scoring.
- More interactive lessons: Adding more gamified elements and interactive learning modules will allow learners to engage with the language in a fun and motivating way.
- Mobile application: We are exploring the possibility of developing a mobile version of LangAI to make it even more accessible and portable for users on the go.
- Report Card System: We plan to integrate a Report Card System where users can track their progress over time. The system will generate personalized notes and detailed report cards based on their performance, highlighting areas for improvement and celebrating achievements.
Built With
- css
- flask
- gsi
- html
- javascript
- livekit
- neon
- nextjs
- node.js
- openai
- postgresql
- python
- react
- tailwind
- typescript
- webrtc
- websockets