Topic Modeling Book Descriptions

Inspiration

We wanted to build a machine learning model that we could deploy with BentoML. Because we are interested in books, we decided to classify books into genres or topics using machine learning. Full-text book corpora are hard to come by, so we settled on topic modeling of book descriptions instead!

What it does

Our project performs topic modeling on a corpus of book descriptions using scikit-learn's LDA (Latent Dirichlet Allocation) model. We exposed the topic model through an API endpoint built with BentoML. The API takes a book description as its query and returns the frequency of each of the 15 topics, computed by the best model we could train within the given timeframe.
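To give a feel for the modeling step, here is a minimal sketch of LDA topic modeling with scikit-learn. It is not our exact pipeline: the toy corpus, the vectorizer settings, and the helper name topic_distribution are assumptions made for illustration, and we assume the "frequency" of each topic corresponds to LDA's document-topic distribution.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy stand-in corpus (assumption); the real project used a book descriptions dataset.
descriptions = [
    "A young wizard discovers a hidden school of magic and ancient dragons.",
    "A detective investigates a murder in a quiet seaside town.",
    "Two strangers fall in love during a summer in Paris.",
    "A starship crew explores a distant planet and meets an alien species.",
    "A historian uncovers the secret letters of a medieval queen.",
    "A chef shares family recipes and stories from her kitchen.",
]

N_TOPICS = 15  # the number of topics our API reports

# Turn each description into a bag-of-words count vector.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(descriptions)

# Fit LDA on the count matrix; random_state makes the run repeatable.
lda = LatentDirichletAllocation(n_components=N_TOPICS, random_state=0)
lda.fit(counts)


def topic_distribution(query):
    """Return the per-topic weights for one description (hypothetical helper)."""
    query_counts = vectorizer.transform([query])
    # transform() returns one normalized row per document; take the only row.
    return lda.transform(query_counts)[0]


weights = topic_distribution("A wizard and a dragon defend a school of magic.")
print(weights.round(3))
```

With a corpus this small the topics are not meaningful; the point is only the shape of the workflow: vectorize, fit, then transform a new description into 15 topic weights.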

How we built it

We used Python's scikit-learn and NLTK libraries to create the model itself and BentoML to create the API endpoint. We were also in the process of building a user-friendly React frontend that sends a request to the API and displays the LDA result.
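As a rough illustration of how NLTK and scikit-learn can fit together, the sketch below cleans descriptions with NLTK before counting words with scikit-learn. The specific steps (regex tokenization, stop word removal, Porter stemming) and the function name preprocess are assumptions for illustration, not a record of the exact preprocessing we ran.

```python
from typing import List

from nltk.stem import PorterStemmer
from nltk.tokenize import RegexpTokenizer
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS, CountVectorizer

# Both NLTK tools below work without downloading extra NLTK data.
tokenizer = RegexpTokenizer(r"[a-z]+")  # keep runs of lowercase letters only
stemmer = PorterStemmer()


def preprocess(text: str) -> List[str]:
    """Lowercase, tokenize, drop stop words, and stem (hypothetical helper)."""
    tokens = tokenizer.tokenize(text.lower())
    return [stemmer.stem(t) for t in tokens if t not in ENGLISH_STOP_WORDS]


# Passing a callable as `analyzer` makes CountVectorizer use our token lists.
vectorizer = CountVectorizer(analyzer=preprocess)
matrix = vectorizer.fit_transform([
    "The dragons were flying over the hidden castles.",
    "A detective investigates murders in a quiet town.",
])

print(preprocess("The dragons were flying over the hidden castles."))
print(matrix.shape)
```

The resulting count matrix is the kind of input an LDA model is fit on. Stemming merges variants such as "dragon" and "dragons" into one vocabulary entry, which helps when the corpus is small.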

Challenges we ran into

We ran into difficulties deploying BentoML to Heroku because we could not find example LDA deployments to follow. We also weren't able to solve a CORS (cross-origin resource sharing) issue, which halted both the Heroku deployment and the frontend work.

Accomplishments that we're proud of

We have a working LDA model that is somewhat accurate at predicting the topic/genre of a given description! We also got a working API endpoint using BentoML.