I was inspired by this article to look for quantitative trends within books. Specifically, I wanted to create a tool that lets a user query for a theme and see how it progresses over the course of a book.

What it does

Given a search term and a book, the program traverses the book paragraph by paragraph and assigns each paragraph a score for how strongly the theme appears in it. The scores are then visualized in a line graph.
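To make the idea concrete, here is a minimal sketch of paragraph-by-paragraph scoring. It is not the project's actual scoring scheme: the `ASSOCIATIONS` weights are made-up stand-ins for what an association service would return, the blank-line paragraph split is an assumed convention for the .txt input, and dividing by paragraph length is one possible normalization among many.

```python
import re

# Hypothetical weighted associations for the query term "war".
# These values are invented for illustration only.
ASSOCIATIONS = {"war": 1.0, "battle": 0.8, "soldier": 0.6, "peace": 0.3}


def split_paragraphs(text):
    """Split plain text into paragraphs on blank lines (an assumed convention)."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def score_paragraph(paragraph, associations):
    """Sum the weights of associated words, divided by the paragraph's word count."""
    words = re.findall(r"[a-z']+", paragraph.lower())
    if not words:
        return 0.0
    total = sum(associations.get(word, 0.0) for word in words)
    return total / len(words)


def score_book(text, associations):
    """Return one score per paragraph, in reading order."""
    return [score_paragraph(p, associations) for p in split_paragraphs(text)]


sample = "The soldier marched to battle.\n\nThey spoke of peace at last."
scores = score_book(sample, ASSOCIATIONS)
print(scores)
```

The first sample paragraph scores higher than the second because it contains two strongly associated words, which is the kind of rise and fall the line graph is meant to show.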

How I built it

I used Python to traverse the book, Calibre to convert epub files to .txt, Twinword to find weighted associations for the query term, and Plotly to visualize the results.
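The visualization step could look roughly like the sketch below, assuming a list of per-paragraph scores is already available. The function names `moving_average` and `plot_scores` are hypothetical, and the optional smoothing is my own assumption rather than something the project does; it is included only because raw paragraph scores tend to be noisy. Plotly is imported inside `plot_scores` so the smoothing helper can be used without it.

```python
def moving_average(values, window=3):
    """Smooth a series with a trailing window (optional, an assumed extra step)."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def plot_scores(scores, title="Theme score by paragraph"):
    """Build a Plotly line graph with paragraph number on the x-axis."""
    import plotly.graph_objects as go  # requires the plotly package

    paragraph_numbers = list(range(1, len(scores) + 1))
    fig = go.Figure(data=go.Scatter(x=paragraph_numbers, y=scores, mode="lines"))
    fig.update_layout(
        title=title,
        xaxis_title="Paragraph",
        yaxis_title="Theme score",
    )
    return fig


# Example with made-up scores; call plot_scores(smoothed).show() to display.
smoothed = moving_average([0, 3, 6], window=2)
print(smoothed)
```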

Challenges I ran into

Connecting all the parts: getting the book, handling the search term, and visualizing the results in the plot.

Accomplishments that I'm proud of

The scoring scheme

What I learned

NLP is a lot more complicated than I once thought.

What's next for BookQL

Connecting the existing parts together and allowing for comparison between books, or between multiple query terms within a single book.
