This image is currently processing. Please be patient...
What it does
We obtain a plot. The plot is preprocessed to tokenize words, remove punctuation, stop words and words not in our dictionary, and make all words lower case. Each word is replaced with its index in the vocabulary. It is then fed to the neural network which is trained using a data set of plots and genres to predict the genre of the plot.
How we built it
For the natural language processing we imported the natural language toolkit which had stop words and the function word_tokenize. For our neural network we used tensorflow and keras to make a neural network with 3 layers.
Challenges we ran into
The genres in our movie data set were not evenly distributed; there were more action movies than any other genre and thus the neural network primarily predicts action movies.
Accomplishments that we're proud of
We built a neural network that uses natural language processing in 10 hours! Pretty cool :)
What we learned
We learned about natural language processing and working with neural networks.
What's next for IMDweeB
What's next could be using a larger data set with a better distribution of genres to more accurately predict a plot's genre