Inspiration
The Art Theatre is a historic and cultural landmark in Champaign-Urbana that has served the community since 1913. We want to do our part to make it a better experience for moviegoers and to help the Art Theatre thrive for many more years.
What it does
We attempt to classify the style of the movies that the Art Theatre shows in comparison with the other movies on IMDb.
How we built it
We started from the dataset provided by the Art Theatre, which contains the names of the movies it has shown. We then scraped IMDb for more information on these movies, such as genre, plot summary, and gross revenue. We also collected the same information for the other movies on the site, with some constraints on the selection. We then used BERT to convert each plot summary into a vector in a high-dimensional space, followed by DBSCAN for clustering. We clustered the full set of IMDb movies we collected by plot summary to see where the Art Theatre movies fall in that space. We can use this to suggest future movies to screen.
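The clustering step can be sketched roughly as follows. This is a minimal illustration, not our actual code: the embeddings are small synthetic vectors standing in for BERT plot-summary vectors, and the cosine metric, the `eps` and `min_samples` values, and the helper names `cluster_embeddings` and `art_theatre_share` are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def cluster_embeddings(embeddings, eps=0.3, min_samples=3):
    """Cluster row vectors with DBSCAN on cosine distance.

    eps and min_samples are illustrative values, not tuned settings.
    A label of -1 marks points DBSCAN treats as noise.
    """
    model = DBSCAN(eps=eps, min_samples=min_samples, metric="cosine")
    return model.fit_predict(embeddings)


def art_theatre_share(labels, is_art_theatre):
    """For each cluster label, the fraction of its movies shown at the Art Theatre."""
    shares = {}
    for label in np.unique(labels):
        members = labels == label
        shares[int(label)] = float(is_art_theatre[members].mean())
    return shares


# Synthetic stand-in for plot-summary embeddings (assumption: real BERT
# vectors would have hundreds of dimensions, not 4).
rng = np.random.default_rng(0)
embeddings = np.vstack([
    np.array([1.0, 0.0, 0.0, 0.0]) + 0.01 * rng.normal(size=(5, 4)),  # group 1
    np.array([0.0, 1.0, 0.0, 0.0]) + 0.01 * rng.normal(size=(5, 4)),  # group 2
    np.array([[0.0, 0.0, 0.0, 1.0]]),                                 # isolated movie
])

# Hypothetical flags: True where the movie appears in the Art Theatre dataset.
is_art_theatre = np.array([True] * 4 + [False] + [False] * 4 + [True] + [False])

labels = cluster_embeddings(embeddings)
shares = art_theatre_share(labels, is_art_theatre)
print(labels)
print(shares)
```

Clusters where the Art Theatre share is high hint at the plot styles the theatre tends to favor, so movies in those clusters that have not yet been screened are natural candidates to suggest.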
Challenges we ran into
Because the IMDb dataset is huge, just exploring it took us a lot of time. Being limited by time, we put constraints on the IMDb movies we selected: only those with a rating above 8 and more than 10,000 votes. Given more time, we would like to expand our project.
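The selection constraint above could look something like this in pandas. It is a sketch only: the column names `title`, `rating`, and `votes`, the function name `select_movies`, and the sample rows are hypothetical and do not reflect the real scraped data.

```python
import pandas as pd


def select_movies(movies, min_rating=8.0, min_votes=10_000):
    """Keep movies with rating strictly above min_rating and more than min_votes votes."""
    keep = (movies["rating"] > min_rating) & (movies["votes"] > min_votes)
    return movies.loc[keep].reset_index(drop=True)


# Made-up rows for illustration; column names are assumptions.
movies = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "rating": [8.5, 8.0, 9.1, 8.7],
    "votes": [25_000, 50_000, 9_000, 10_001],
})

selected = select_movies(movies)
print(selected)
```

Both comparisons are strict, so a movie rated exactly 8 or with exactly 10,000 votes is excluded.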
Accomplishments that we're proud of
We are proud of having done this in under 10 hours. We started learning BERT as we went along, and we are confident we can do better.
What we learned
BERT, teamwork
What's next for Art theatre movie suggestions
We would like to base suggestions not only on plot summaries but also on other variables, such as the preferences of different age groups and trends based on pricing.
Built With
- bert
- dbscan
- matplotlib
- natural-language-processing
- pandas
- pca
- python
- scikitlearn