Inspiration
Our friend Alejandro was watching videos on the Putnam math competition while querying ChatGPT in another window, and he had to manually feed the model context for every conversation. Additionally, he later found that some of the questions he was asking were already answered in the video's comments section.
We decided that a collaborative and interactive way to solve this problem could be really well received. This could help in any asynchronous learning activity, especially for underserved students without one-on-one help.
Problem & Solution
As students and educators, we have noticed that learning in an asynchronous environment can be especially challenging. Because asynchronous learning happens offline, it is hard to get help when something in a lecture is unclear, which leads to frustration, lost productivity, and distraction. Education also carries an implicit bias towards English, which particularly affects international students studying in the United States.
To solve this, we set out to build a web app that provides those missing interactions for students. Students can watch the same videos together and annotate them with comments and questions. This is a form of active learning, which is designed to be engaging for students. Our AI assistant uses context from transcripts to answer students' questions, and it supports 15+ languages to provide a more equitable solution for all students.
To minimize keystrokes, we have a voice mode that lets students ask questions aloud. Our research in education has shown that students who interact directly with open phones and computers are more likely to be distracted than peers who interact with the system indirectly. All in all, Lucid gives students a way to collaborate, learn in a guided way, and experience all of this in the language of their choice.
What it does
The primary user flow centers on a user pasting a YouTube link into Lucid, which gives them an interface for watching the video alongside an intelligent assistant (powered by GPT-4). The assistant draws on the video transcript to help the user.
- Provides an AI assistant for students learning from long-form videos (lectures, other educational content, etc.)
- Allows students to view their peers' questions to AI assistants, giving them helpful context.
- Offers accessibility features, such as translations and multilingual support, so that as many students as possible can learn
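As a rough illustration of how transcript context might reach the assistant, here is a minimal sketch. It assumes questions are attached to a timestamp and that the transcript is a list of timed segments; the `TranscriptSegment` shape, the function names, the 60-second window, and the prompt wording are all hypothetical and not taken from Lucid's code.

```typescript
// Hypothetical shape for one transcript line; the real format is not described in this write-up.
interface TranscriptSegment {
  start: number; // seconds from the start of the video
  text: string;
}

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Collect the transcript text within `windowSeconds` of the moment the question refers to.
function selectContext(
  segments: TranscriptSegment[],
  atSeconds: number,
  windowSeconds = 60,
): string {
  return segments
    .filter((s) => Math.abs(s.start - atSeconds) <= windowSeconds)
    .map((s) => s.text.trim())
    .join(" ");
}

// Build chat messages that carry the excerpt and the student's preferred language.
function buildMessages(question: string, context: string, language: string): ChatMessage[] {
  return [
    {
      role: "system",
      content:
        `You are a tutor helping a student with a video lecture. ` +
        `Answer in ${language}. Transcript excerpt: ${context}`,
    },
    { role: "user", content: question },
  ];
}
```

Keeping only the segments near the timestamp is one way to stay within the model's context limit on long lectures; a real implementation might choose a different selection strategy.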
How we built it
- Next.js and React for our frameworks, Firebase for persistent storage.
- APIs: YouTube API, OpenAI API (GPT-4, Whisper for voice recognition), Eleven Labs for text-to-speech
When a new question is asked, it is routed to our server, which calls the GPT-4 API (and Whisper if the question was asked via voice). The text answer is then streamed to the client. The question and answer are also saved in Firebase, under that video object.
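The flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not our actual server code: `QuestionRequest`, `Deps`, and `handleQuestion` are hypothetical names, and the three injected functions stand in for the real Whisper, GPT-4, and Firebase calls.

```typescript
// All names here are illustrative stand-ins for the real services.
interface QuestionRequest {
  videoId: string;
  text?: string; // typed question
  audio?: Uint8Array; // recorded question, present only in voice mode
}

interface Deps {
  transcribe(audio: Uint8Array): Promise<string>; // e.g. a Whisper call
  complete(question: string): AsyncIterable<string>; // e.g. a streamed GPT-4 call
  save(videoId: string, qa: { question: string; answer: string }): Promise<void>; // e.g. a Firebase write
}

// Yields answer chunks as they arrive, then stores the full exchange under the video.
async function* handleQuestion(req: QuestionRequest, deps: Deps): AsyncGenerator<string> {
  // Voice questions are transcribed first; typed questions are used as-is.
  const question = req.audio ? await deps.transcribe(req.audio) : (req.text ?? "");

  let answer = "";
  for await (const chunk of deps.complete(question)) {
    answer += chunk;
    yield chunk; // forward each chunk to the client immediately
  }

  // Persist only once the stream has completed, so the stored answer is whole.
  await deps.save(req.videoId, { question, answer });
}
```

Passing the services in as `deps` keeps the sketch runnable without any SDK and makes the flow easy to test with fakes.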
Challenges we ran into
- Streaming GPT-4 responses (asynchronous JavaScript generators, EventSource, SSE as a substitute for WebSockets)
- YouTube API headaches and deprecated versions
- Time crunch (more interesting things we still want to do with this)
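For readers unfamiliar with the streaming pieces mentioned above, here is a minimal sketch of turning an asynchronous generator of text chunks into Server-Sent Events frames. The function names and the closing `done` event are assumptions made for illustration; our actual implementation may differ.

```typescript
// Format one payload as a Server-Sent Events frame.
// Each payload line gets its own "data:" field, and a blank line ends the event.
function toSseFrame(payload: string, event?: string): string {
  const head = event ? `event: ${event}\n` : "";
  const body = payload
    .split("\n")
    .map((line) => `data: ${line}`)
    .join("\n");
  return `${head}${body}\n\n`;
}

// Wrap a stream of text chunks (for example, model output) as SSE frames,
// ending with a named "done" event so the client knows when to close the connection.
async function* toSseStream(chunks: AsyncIterable<string>): AsyncGenerator<string> {
  for await (const chunk of chunks) {
    yield toSseFrame(chunk);
  }
  yield toSseFrame("[DONE]", "done");
}
```

In the browser, an `EventSource` receives the unnamed frames through `onmessage` and the named frame through `addEventListener("done", ...)`. Because the answer only flows from server to client, SSE covers this case without the two-way channel a WebSocket provides.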
Accomplishments that we're proud of
- Language model uses video transcript as context for answers
- Lucid supports 15+ languages for the entire user flow
- Speech to text and text to speech models working effectively
- Persistent database that stores questions and answers
What we learned
- Streaming generated text is a massive UX upgrade
- Prompt engineering and its pitfalls
- Next.js as the backbone of our app
What's next for Lucid
Though we feel we made a strong proof-of-concept, there are several features that could be added to make Lucid an enterprise-grade product.
From a professor's point of view:
- Mark incorrect AI-generated answers as incorrect
- Fine-tuned models, adjusted model parameters, and custom instructions
From a user's point of view:
- Choose a time segment to ask a question about, not just a point in time
- Option to keep questions/answers private
Who we are
Our multidisciplinary curricula and background have given us great expertise to draw on during this project. In addition to experience in computer science, we are:
- 2 mathematics students
- 1 economics student
- 2 teaching assistants
Our experiences, whether from teaching or through our courses, have shown us the pain points in education and learning. We want to improve the educational process not just for ourselves, but for others.
Why who we are matters
- We know the challenges of teaching and learning as teaching assistants and students
- We know what can be done to improve the learning experience
- Our math backgrounds have helped us understand ML theory
- Our economics background has helped with non-STEM learning