Movie Recommendation Bot
Inspiration
As a film lover, I often found myself endlessly scrolling through Netflix trying to decide what to watch. I wanted to solve this problem by building a recommendation engine that gives meaningful, personalized suggestions.
Originally, I set out to fine-tune a large language model (LLM) for movie recommendations based on user preferences. However, due to lack of access to a GPU, I had to get creative. This led me to explore classical machine learning techniques, which turned out to be surprisingly powerful for the task.
What it does
The Movie Recommendation Bot allows a user to:
- Input a movie title or description
- Receive the ** most similar movies** based on their descriptions
- Get quick, relevant recommendations without needing an account or rating history
It uses TF-IDF vectorization of movie descriptions and cosine similarity to determine which titles are closest to the user's input.
How we built it
- Dataset: A cleaned version of the Netflix Movies and TV Shows dataset
- Tech stack:
- Python
- Pandas & Numpy
- Scikit-learn (
TfidfVectorizer,cosine_similarity)
- Process:
- Preprocessed the movie descriptions (null values, lowercase, stopwords)
- Converted descriptions into numerical vectors using TF-IDF
- Calculated pairwise cosine similarity
- Returned the top 5 similar titles to the input movie
Challenges we ran into
- Lack of GPU resources meant I had to abandon the original LLM-based idea
- Ensuring the text preprocessing pipeline was clean and consistent
- Dealing with missing or minimal descriptions that impacted recommendation quality
- Handling user typos or variations in movie title input (future improvement)
Accomplishments that we're proud of
- Completed a fully functional recommendation engine in under 48 hours
- Learned and implemented NLP techniques like TF-IDF and cosine similarity from scratch
- Successfully pivoted from a heavy LLM solution to a fast and efficient lightweight model
- Wrote modular, readable code that can be extended into a full-stack app later
What we learned
- How to apply vector space models for content-based filtering
- Importance of good preprocessing in natural language applications
- How to make trade-offs when resources are limited (LLM vs traditional ML)
- Basics of recommendation systems and the practical use of similarity metrics
What's next for Movie Recommendation Bot
- Improve the UI
- Eventually return to the original idea: LLM-powered recommendations, once GPU resources are available
- Switch from a static dataset to an actual API that retrieves data
Built With
- kaggle
- nltk
- pandas
- plotly
- python
- scikit-learn
- streamlit
Log in or sign up for Devpost to join the conversation.