Inspiration
LearnStream AI is inspired by the evolving landscape of university education in the AI-driven, post-COVID era. Recognizing the challenges of maintaining student engagement in diverse, complex subjects and long class durations, our platform seeks to leverage AI to enhance learning efficiency. We're motivated by the need to bridge language barriers for international students and provide real-time feedback mechanisms for teachers, ensuring a more responsive and inclusive educational environment.
What it does
LearnStream AI offers real-time course summaries and specialized terminology to aid comprehension. It provides instant feedback to teachers, enabling dynamic course adjustments. By eliminating language barriers, our platform ensures equal learning opportunities for all students, enhancing overall classroom interaction and understanding in a technology-integrated educational setting.
How we built it
- Input and Configuration: The teacher's audio is captured directly from a microphone. Separate speech and audio configurations set the parameters for processing.
- Transmission: The audio is streamed to our server, and from there to Azure AI, over a WebSocket connection.
- Speech Recognition: Azure's cognitive AI component processes the audio stream and produces live transcription results, using speech-to-text technology to convert spoken words into written text in real time.
- Output: The live transcription result is sent to both the teacher's and the students' browsers over a WebSocket connection. Students receive two outputs: the live transcript and a summary. The details of the summarization step are not covered here.
- Real-Time Interaction: The system allows for real-time interaction and feedback, which means as the teacher speaks, the speech is almost instantly transcribed and summarized for the student's benefit.
- Deployment: All services are deployed on Azure App Service.
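The flow described above (audio chunks in, transcript out, the same result delivered to every connected browser) can be sketched roughly as follows. This is an illustration under stated assumptions, not our production code: `TranscriptHub`, `fake_recognizer`, and `relay` are hypothetical names, in-memory `asyncio` queues stand in for the WebSocket connections, and the recognizer is a placeholder for the Azure speech-to-text call.

```python
import asyncio


class TranscriptHub:
    """Fans each transcript message out to every connected client queue.

    Hypothetical stand-in for the set of open browser WebSocket connections.
    """

    def __init__(self):
        self._clients = set()

    def connect(self):
        queue = asyncio.Queue()
        self._clients.add(queue)
        return queue

    def disconnect(self, queue):
        self._clients.discard(queue)

    async def publish(self, message):
        for queue in self._clients:
            await queue.put(message)


async def fake_recognizer(audio_chunks):
    """Placeholder for the speech-to-text service: yields one text per chunk."""
    for index, chunk in enumerate(audio_chunks):
        await asyncio.sleep(0)  # yield control, as a real network call would
        yield f"segment {index} ({len(chunk)} bytes)"


async def relay(audio_chunks, hub):
    """Forward each recognized segment to all connected clients."""
    async for text in fake_recognizer(audio_chunks):
        await hub.publish({"type": "transcript", "text": text})


async def demo():
    hub = TranscriptHub()
    teacher = hub.connect()
    student = hub.connect()
    await relay([b"\x00" * 3200, b"\x00" * 1600], hub)
    teacher_msgs = [teacher.get_nowait() for _ in range(2)]
    student_msgs = [student.get_nowait() for _ in range(2)]
    return teacher_msgs, student_msgs


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In a real deployment each queue would be replaced by a WebSocket send, and a summarization step could subscribe to the hub like any other client.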
Challenges we ran into
Our main challenge centers around real-time translation and speech synthesis. We need to decide the optimal moment for speech synthesis—whether after completing a sentence or during simultaneous translation. This process is complicated by the need to modify content during synthesis, particularly in simultaneous interpretation. Additionally, the system's performance varies with the dataset, showing better results with uniform sentence lengths, as translation and synthesis proceed together. Moreover, dramatic fluctuations in voice output are a concern, especially when short sentences are followed by longer ones.
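One of the two options above, synthesizing only after a sentence is complete, could be approached with a small buffer that collects transcript fragments and releases whole sentences. The sketch below is only an assumption about how such a buffer might look: `SentenceBuffer` is a hypothetical name, and treating `.`, `!`, and `?` as sentence boundaries is a simplification that would misfire on abbreviations or decimals.

```python
import re

# A sentence boundary: whitespace that follows '.', '!' or '?'.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


class SentenceBuffer:
    """Accumulates streamed transcript fragments and emits full sentences."""

    def __init__(self):
        self._pending = ""

    def feed(self, fragment):
        """Add a fragment; return the list of sentences completed so far."""
        self._pending += fragment
        parts = _SENTENCE_END.split(self._pending)
        tail = parts.pop()  # the last piece may still be mid-sentence
        if tail.rstrip().endswith((".", "!", "?")):
            parts.append(tail)
            tail = ""
        self._pending = tail
        return [part.strip() for part in parts if part.strip()]

    def flush(self):
        """Return whatever is left, e.g. when the lecture ends."""
        rest, self._pending = self._pending.strip(), ""
        return rest


if __name__ == "__main__":
    buffer = SentenceBuffer()
    for piece in ["Today we cover ", "graphs. A graph has ", "nodes and edges."]:
        for sentence in buffer.feed(piece):
            print("synthesize:", sentence)
```

Waiting for a full sentence trades latency for stability: nothing already spoken needs to be revised, but long sentences delay the output, which matches the fluctuation we observed between short and long sentences.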
Audio Format Limitations: The system only supports WAV format with a fixed 16 kHz sample rate, and does not support other formats like Mac's m4a. This requires extra preprocessing.
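As an illustration of the kind of preprocessing this implies, the sketch below converts a 16-bit mono WAV file at any sample rate to 16 kHz using only the Python standard library. It is a minimal example under assumptions rather than our actual pipeline: `resample_linear` and `to_16k_wav` are hypothetical helpers, linear interpolation applies no anti-aliasing filter, and compressed formats such as m4a would first need an external decoder (for example ffmpeg) to produce PCM WAV.

```python
import array
import io
import sys
import wave

TARGET_RATE = 16_000  # the fixed sample rate mentioned above


def resample_linear(samples, src_rate, dst_rate=TARGET_RATE):
    """Naively resample 16-bit mono samples by linear interpolation."""
    if src_rate == dst_rate or len(samples) == 0:
        return array.array("h", samples)
    out_len = max(1, round(len(samples) * dst_rate / src_rate))
    out = array.array("h")
    for i in range(out_len):
        pos = i * src_rate / dst_rate  # position in the source signal
        left = min(int(pos), len(samples) - 1)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(round(samples[left] * (1 - frac) + samples[right] * frac))
    return out


def to_16k_wav(wav_bytes):
    """Return 16 kHz WAV bytes from a 16-bit mono PCM WAV input."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        if src.getnchannels() != 1 or src.getsampwidth() != 2:
            raise ValueError("this sketch handles 16-bit mono PCM only")
        rate = src.getframerate()
        samples = array.array("h", src.readframes(src.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    resampled = resample_linear(samples, rate)
    if sys.byteorder == "big":
        resampled.byteswap()
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(TARGET_RATE)
        dst.writeframes(resampled.tobytes())
    return buffer.getvalue()
```

A dedicated audio library would give better quality, since proper resampling low-pass filters the signal before reducing the rate.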
Accomplishments that we're proud of
We take immense pride in our accomplishments: Our real-time translation technology shatters language barriers, enabling non-native speakers to fully engage with course materials in their own languages. Our system's ability to provide brief summaries has proven crucial for students with attention challenges or those confronted with dense material, ensuring critical information is retained. Additionally, our platform's real-time feedback mechanism creates a more interactive and participative classroom experience, facilitating instant clarification and promoting active learning. Collectively, these innovations underscore our dedication to enhancing education through technology, creating a more adaptable, inclusive, and personalized learning landscape for students navigating the complexities of diverse academic subjects.
What we learned
Through our journey, we've gained a deeper understanding of the challenges faced by students. We've learned the intricacies of asynchronous processing and the various factors that must be considered to ensure smooth operation. Additionally, we've become proficient in utilizing SDKs in conjunction with WebSocket technology to create a more interactive and responsive educational platform. These insights have been invaluable in enhancing our service and will guide our future developments.
What's next for LearnStream
In the next stage, we are introducing 'Smart Textbook Search', a feature that employs fuzzy logic to enhance resource discovery, enabling learners to find relevant material efficiently. Additionally, we plan to expand our courses to an online platform, making education more accessible to a wider audience. For a truly customized learning journey, our 'Enhanced Personalization' system will recommend tailored courses and career paths based on individual learning trajectories, supporting learners in achieving their unique educational and professional goals.
Built With
- azure-app-service
- azureai
- cognitive-ai
- cosmos-db
- css
- html
- javascript
- python