Inspiration
As a university student myself, I often found it difficult to navigate through the multitude of courses offered at my university and choose the ones that aligned with my interests. I noticed that many of my peers struggled with the same issue and were not aware of the various courses that could have enriched their academic experience. Currently, students are stuck using either keyword searches or scrolling through pages of courses on our university website. This inspired me to create a solution that would help students easily find courses that discuss topics they are interested in, thus making the process of course selection more efficient and enjoyable. With the advent of natural language processing techniques and machine learning models, I saw an opportunity to apply these technologies to the problem of course recommendation and help students optimize their academic journey.
What it does
My project aims to help university students find courses that match their interests by leveraging natural language processing and machine learning techniques. The system creates sentence embeddings for all course descriptions at UIUC and compares them with the embeddings generated from user queries. By using cosine similarity to identify the closest matches, the system recommends relevant courses to the user. The platform provides personalized course recommendations that enable students to easily navigate the vast array of courses offered by their university and make informed decisions that align with their interests. With my solution, students can optimize their academic journey and get the most out of their university experience.
With That's a Course?!, students can describe a course they would like to take and find it with ease.
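The matching step described above can be illustrated with a minimal sketch. It is not the project's actual code: the three-dimensional vectors and the course names are made-up stand-ins for real sentence embeddings, and `rank_courses` is a hypothetical helper name.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_courses(query_vec, course_vecs, top_k=3):
    """Return the top_k (course, score) pairs, most similar first."""
    scored = [
        (cosine_similarity(query_vec, vec), name)
        for name, vec in course_vecs.items()
    ]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:top_k]]


# Toy vectors (assumed for illustration); real embeddings have
# hundreds of dimensions and come from the embedding model.
COURSE_VECS = {
    "COURSE A": [0.9, 0.1, 0.0],
    "COURSE B": [0.1, 0.9, 0.1],
    "COURSE C": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]
results = rank_courses(query, COURSE_VECS, top_k=2)
```

In the real system the vector database performs this nearest-neighbour search, so a brute-force loop like this is only useful for understanding what the ranking means.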
How I built it
To build this project, I utilized datasets provided by Professor Wade to collect information on all courses offered at the University of Illinois at Urbana-Champaign, including their descriptions and average GPAs. After cleaning the data and transforming it into JSON format, I used a sentence embedding model from HuggingFace to generate embeddings for each course description. These embeddings were then stored in a vector database using Weaviate. I also designed the front-end of the platform, taking inspiration from Neo Brutalist design principles and drawing on the visual aesthetics of popular websites like Figma.com and Gumroad.com. The result is a powerful tool that leverages advanced natural language processing and machine learning techniques to provide personalized course recommendations to university students.
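The cleaning and JSON step might look roughly like the sketch below. The column names (`subject`, `number`, `title`, `description`, `average_gpa`) and the sample rows are assumptions made for illustration; the actual datasets may be laid out differently, and `clean_courses` is a hypothetical helper.

```python
import csv
import io
import json


def clean_courses(csv_text):
    """Turn raw CSV rows into de-duplicated, embedding-ready records."""
    records = []
    seen = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        code = f"{row['subject'].strip().upper()} {row['number'].strip()}"
        # Collapse runs of whitespace left over from the source data.
        description = " ".join(row["description"].split())
        # Skip courses with nothing to embed, and repeated course codes.
        if not description or code in seen:
            continue
        seen.add(code)
        gpa = row["average_gpa"].strip()
        records.append({
            "code": code,
            "title": row["title"].strip(),
            "description": description,
            "average_gpa": float(gpa) if gpa else None,
        })
    return records


# Made-up sample rows; the column names are hypothetical.
SAMPLE_CSV = """subject,number,title,description,average_gpa
abc,101,Sample Course One,"  An   example description.  ",3.42
abc,101,Sample Course One,"Duplicate row.",3.42
xyz,200,Sample Course Two,,3.10
xyz,300,Sample Course Three,"Another example.",
"""
records = clean_courses(SAMPLE_CSV)
courses_json = json.dumps(records, indent=2)
```

Each resulting record carries the description text that gets embedded, plus the fields (such as average GPA) that can be stored alongside the vector and shown in search results.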
Challenges I ran into
During this project, one of the primary challenges I faced was with the sentence embedding model I initially used. The OpenAI embedding model, which was available through Weaviate's text2vec-openai module, performed poorly when matching user queries with course descriptions. I experimented with various models offered by different providers, including Cohere and HuggingFace, to find one that would better meet my needs. Eventually, I settled on the all-roberta-large-v1 model from HuggingFace, which provided significantly improved performance and accuracy. This challenge required persistence and a willingness to try different solutions until I found one that worked well for my project.
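One way to make a model comparison like this repeatable is to score each candidate on a small set of queries with known right answers. The sketch below is an assumption about how that could be done, not a record of how I compared models: `hit_rate_at_k` is a hypothetical helper, and the two toy embedders stand in for real models such as all-roberta-large-v1, which would be wrapped as a function from text to vector.

```python
import math


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hit_rate_at_k(embed, courses, labelled_queries, k=1):
    """Fraction of queries whose expected course appears in the top k."""
    course_vecs = {code: embed(text) for code, text in courses.items()}
    hits = 0
    for query, expected in labelled_queries:
        q = embed(query)
        ranked = sorted(
            course_vecs,
            key=lambda code: cosine(q, course_vecs[code]),
            reverse=True,
        )
        if expected in ranked[:k]:
            hits += 1
    return hits / len(labelled_queries)


# Toy stand-ins for real embedding models (illustration only).
VOCAB = ["music", "history", "robots", "programming"]


def keyword_embedder(text):
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]


def length_embedder(text):
    return [float(len(text)), 1.0]  # deliberately uninformative


COURSES = {"COURSE A": "history of music", "COURSE B": "programming robots"}
QUERIES = [("music history", "COURSE A"), ("robots", "COURSE B")]

scores = {
    "keyword": hit_rate_at_k(keyword_embedder, COURSES, QUERIES),
    "length": hit_rate_at_k(length_embedder, COURSES, QUERIES),
}
```

With a few dozen hand-labelled queries, a harness like this turns "this model feels better" into a number that can be compared across providers.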
Accomplishments that I am proud of
I am proud of several accomplishments from this project. First and foremost, I was able to successfully integrate a vector database using Weaviate, a technology that was new to me. This allowed me to store and query the sentence embeddings for all course descriptions in an efficient and scalable way, which was essential for the performance of my platform. Additionally, I was able to experiment with a variety of semantic embedding models from different providers, rather than becoming stuck with an initially suboptimal one. This allowed me to identify the all-roberta-large-v1 model from HuggingFace as the best fit for my needs and ultimately achieve a much higher level of accuracy and performance in my course recommendation engine. These accomplishments reflect the benefit of exploring new technologies and experimenting with different approaches until I found the best solutions for my project.
What I learned
Throughout the course of this project, I learned a great deal about the potential of semantic embeddings to revolutionize the field of search. By using advanced natural language processing techniques to represent text data as dense, high-dimensional vectors, I was able to create a powerful course recommendation engine that can match user queries to relevant course descriptions with incredible accuracy. This technique has broad implications for a wide range of applications beyond just education, including e-commerce, content discovery, and more. I believe that by continuing to explore and refine these methods, we can greatly improve the way users search for and discover the things they are interested in, opening up new possibilities for personalized and relevant experiences across a variety of domains.
What's next for That's a Course?!
One potential area for improvement in my project is the quality of the course descriptions themselves. Currently, many of the descriptions provided by UIUC are relatively sparse and lacking in detail, which can make it more difficult to accurately match user queries to relevant courses. To address this, one possible solution would be to leverage a tool like GPT or other LLMs to generate more detailed and informative descriptions for each course. This would involve feeding existing course descriptions into the model and using it to generate more fleshed-out descriptions that better capture the content and scope of each course. However, it would be important to ensure that the generated descriptions are accurate and don't inadvertently introduce new concepts or topics that aren't actually covered in the course, which could potentially mislead users. By improving the quality and specificity of the course descriptions, I could further enhance the accuracy and relevance of my platform's search results, providing even greater value to users.
And depending on the success of That's a Course?!, I might also build That's a RSO?!
Built With
- huggingface
- javascript
- python
- weaviate