Lucid

Inspiration

Our friend Alejandro was watching videos on the Putnam math competition while querying ChatGPT in another window, and he had to manually feed the model context for every conversation. Additionally, he later found that some of the questions he was asking were already answered in the video's comments section.

We thought a collaborative, interactive solution to this problem would be well received. It could help in any asynchronous learning activity, especially for underserved students without one-on-one help.

Problem & Solution

As students and educators, we have noticed that it can be especially challenging to learn in an asynchronous environment. Because asynchronous learning happens offline, it can be hard to get help if something in a lecture is unclear, leading to frustration, lost productivity, and distraction. Education also carries an implicit bias toward English, which especially affects international students studying in the United States.

To solve this, we set out to build a web app that provides those missing interactions for students. Students can watch the same videos together and annotate the video with their comments and questions. This is a form of active learning, which is designed to be engaging for students. Our AI assistant uses context from transcripts to help students who have a question, and it supports 15+ languages to provide a more equitable solution for all students.

To minimize keystrokes, we offer a voice mode that lets students ask questions by speaking. Our research in education has shown that students who interact directly with open phones and computers are more likely to be distracted than peers who interact with the system indirectly. All in all, Lucid gives students a way to collaborate, learn in a guided way, and do it all in the language of their choice.

What it does

The primary user flow starts with a user pasting a YouTube link into Lucid, which lets them watch the video alongside an intelligent assistant (powered by GPT-4). The assistant draws on the video transcript as context when helping the user.

Provides an AI assistant for students learning from long-form videos (lectures, other educational content, etc.)
Allows students to view their peers' questions to the AI assistant, giving them helpful context
Offers accessibility features, such as translation and multilingual support, to reach as many students as possible
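
As a rough illustration of how a transcript could serve as context, the sketch below selects the captions near the moment a student asks a question and folds them into chat-style messages. Everything here is an assumption made for illustration: the `TranscriptSegment` shape, the 60-second window, the prompt wording, and the function names are hypothetical, and Lucid may pass the transcript to GPT-4 differently.

```typescript
// Hypothetical shape for one caption line of a video transcript.
interface TranscriptSegment {
  startSec: number; // when the caption starts, in seconds
  text: string;
}

// Minimal chat message shape (role plus content).
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Format seconds as m:ss so the model can refer to moments in the video.
function formatTime(sec: number): string {
  const m = Math.floor(sec / 60);
  const s = Math.floor(sec % 60);
  return `${m}:${s.toString().padStart(2, "0")}`;
}

// Keep only captions within `windowSec` seconds of the question's timestamp,
// so the prompt stays short and relevant. The window size is an assumption.
function selectContext(
  segments: TranscriptSegment[],
  atSec: number,
  windowSec = 60,
): TranscriptSegment[] {
  return segments.filter((s) => Math.abs(s.startSec - atSec) <= windowSec);
}

// Put the transcript excerpt in the system message and the student's
// question in the user message.
function buildMessages(
  segments: TranscriptSegment[],
  atSec: number,
  question: string,
  language = "English",
): ChatMessage[] {
  const excerpt = selectContext(segments, atSec)
    .map((s) => `[${formatTime(s.startSec)}] ${s.text}`)
    .join("\n");
  const system =
    "You are a tutor helping a student who is watching a video. " +
    `Use the transcript excerpt below as context and answer in ${language}.` +
    `\n\nTranscript excerpt:\n${excerpt}`;
  return [
    { role: "system", content: system },
    { role: "user", content: question },
  ];
}
```

The resulting array could then be handed to a chat completion call. Passing the target language in the system message is one simple way to get multilingual answers, though other approaches are possible.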

How we built it

Next.js and React for our frameworks, Firebase for persistent storage.
APIs: YouTube API, OpenAI API (GPT-4, Whisper for voice recognition), and ElevenLabs for text-to-speech

When a new question is asked, it is routed to our server, which calls the GPT-4 API (and Whisper if the question was asked via voice). The text answer is then streamed to the client. The question and answer are also saved in Firebase, under that video object.
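
The flow described above can be sketched as an async generator that forwards each chunk of the answer as it arrives and then persists the full question and answer. This is a minimal sketch under stated assumptions: the `QaRecord` fields, the `Deps` interface, and the function names are hypothetical, and the real OpenAI and Firebase calls are replaced by injected functions so the example stays self-contained.

```typescript
// Hypothetical record stored under a video once an answer finishes.
interface QaRecord {
  videoId: string;
  timestampSec: number;
  question: string;
  answer: string;
}

// Injected dependencies. Real versions would wrap the OpenAI and Firebase
// SDKs; they are left abstract here on purpose.
interface Deps {
  streamAnswer: (question: string) => AsyncIterable<string>;
  saveRecord: (record: QaRecord) => Promise<void>;
}

// Forward each chunk to the caller as soon as it arrives, accumulate the
// full text, and save the record only after the stream has completed.
async function* answerQuestion(
  deps: Deps,
  videoId: string,
  timestampSec: number,
  question: string,
): AsyncGenerator<string> {
  let answer = "";
  for await (const chunk of deps.streamAnswer(question)) {
    answer += chunk;
    yield chunk;
  }
  await deps.saveRecord({ videoId, timestampSec, question, answer });
}
```

A route handler could iterate over `answerQuestion(...)` and write each chunk to the response. One trade-off of saving at the end is that a client who disconnects mid-stream may leave no record, so a production version might choose to save partial answers instead.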

Challenges we ran into

Streaming GPT-4 responses (asynchronous JavaScript generators, EventSource, SSE as a substitute for WebSockets)
YouTube API headaches and deprecated versions
Time crunch (more interesting things we still want to do with this)
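
To make the streaming challenge concrete, here is a minimal sketch of turning an async stream of text chunks into Server-Sent Events frames. The function names and the `done` sentinel event are assumptions for illustration, not Lucid's actual code. The framing itself follows the SSE wire format: each line of the payload gets a `data:` prefix, and a blank line ends the event.

```typescript
// Format one chunk of text as a single SSE event. A payload containing
// newlines needs one "data:" line per line of text.
function toSseFrame(chunk: string): string {
  const lines = chunk.split("\n").map((line) => `data: ${line}`);
  return lines.join("\n") + "\n\n";
}

// Wrap any async stream of text chunks as a stream of SSE frames, ending
// with a named sentinel event (an assumed convention) so the client knows
// the answer is complete and can close the connection.
async function* toSseStream(
  chunks: AsyncIterable<string>,
): AsyncGenerator<string> {
  for await (const chunk of chunks) {
    yield toSseFrame(chunk);
  }
  yield "event: done\ndata: {}\n\n";
}
```

A server would send these frames with the `Content-Type: text/event-stream` header, and a browser could consume them with `EventSource`. Since answers only flow from server to client, this one-way channel can stand in for a WebSocket here.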

Accomplishments that we're proud of

Language model uses video transcript as context for answers
Lucid supports 15+ languages for the entire user flow
Speech-to-text and text-to-speech models working effectively
Persistent database that keeps questions and answers saved

What we learned

Streaming generated text is a massive UX upgrade
Prompt engineering and its pitfalls
Next.js as the backbone of our app

What's next for Lucid

Though we feel we made a strong proof-of-concept, there are several features that could be added to make Lucid an enterprise-grade product.

From a professor's point of view:

Mark incorrect AI-generated answers as incorrect
Fine-tuned models, adjusted model parameters, and custom instructions

From a user's point of view:

Choose a time segment to ask a question about, not just a point in time
Option to keep questions/answers private

Who we are

Our multidisciplinary curricula and background have given us great expertise to draw on during this project. In addition to experience in computer science, we are:

2 mathematics students
1 economics student
2 teaching assistants

Our experiences, whether in teaching or through our courses, have shown us the pain points in education and learning. We want to improve the educational process not just for ourselves, but for others.

Why who we are matters

We know the challenges of teaching and learning as teaching assistants and students
We know what can be done to improve the learning experience
Our math backgrounds have helped us understand ML theory
Our economics background has helped with non-STEM learning

Submitted to

University of Florida Gator Hackathon Discord https://discord.gg/5935yp7ety

Created by

I worked on everything from developing features such as translation and the frontend to making sure that what we built aligned with what I learned as a TA at UF and as a student of computing education. I am a graduate computing student.

Justin Ho
I am a Math and Computer Science student.

I worked on text-to-speech, speech-to-text, transcription support, and UI components. I also brought in what I've learned as a TA over the last two and a half years to help improve the design.

Benny Cortese
I am a Math and Computer Science student.

I worked on the YouTube embed, custom progress bar, question & answer UI, and data format.

Jim Su
I am an Economics and Computer Science student.

I worked on connecting to GPT-4, streaming its output, and displaying questions and responses in "threaded" form for a given timestamp.

Shehzad Shah