Keyword Context Tracking with NLP

Inspiration

As students, we find ourselves reading a lot of scholarly articles. To streamline this process, we use NLP to help us search articles for meaning and context.

What it does

It takes keywords as input and uses semantic search to find the paragraphs that are most relevant to them in meaning and context.

How we built it

We used Cohere's API to bring NLP into the project, training a model and comparing passages by similarity of meaning. We also used React JS for the frontend demo and Flask for the backend, which connects the frontend to the language processing code.
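
To illustrate what comparing text "in meaning" can look like, here is a minimal sketch that ranks paragraphs against a query by cosine similarity. It assumes each text has already been turned into an embedding vector (for example by an embedding model such as the ones Cohere's API provides). The write-up does not say which similarity measure the project used, so cosine similarity is an assumption, and the names `cosine_similarity` and `rank_paragraphs` and the toy vectors are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_paragraphs(
    query_vec: Sequence[float],
    paragraph_vecs: List[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (paragraph_index, score) pairs, highest similarity first."""
    scored = [
        (index, cosine_similarity(query_vec, vec))
        for index, vec in enumerate(paragraph_vecs)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors standing in for real embeddings (assumption:
# a real embedding model would return much longer vectors).
query = [1.0, 0.0, 1.0]
paragraphs = [
    [0.9, 0.1, 0.8],  # points in nearly the same direction as the query
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.5, 0.5, 0.5],  # partially aligned with the query
]
top = rank_paragraphs(query, paragraphs, top_k=2)
```

In a setup like ours, the Flask backend would be a natural place to run this ranking and return the best-matching paragraphs to the frontend.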

Challenges we ran into / What we learned

A major challenge in building this project was parsing the PDF. We needed the text split into paragraphs while also keeping track of each paragraph's page number. This was difficult because splitting the text on new lines ("\n") or double new lines ("\n\n") did not work on every PDF we tried, since different PDFs use different formatting. Moving forward, we would like to build a more general parsing tool that works on a wider range of PDFs.
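
As a rough sketch of the paragraph-and-page bookkeeping described above, the snippet below splits already-extracted page text on blank lines and records the page each paragraph came from. It assumes the PDF text has been extracted into one string per page by some other tool (not shown), and the name `split_paragraphs` is hypothetical. It only covers the blank-line case, so it shares the limitation we ran into: PDFs whose extracted text has no blank lines between paragraphs would need a different heuristic.

```python
import re
from typing import List, Tuple

# One or more blank lines (possibly containing spaces or tabs) separate paragraphs.
BLANK_LINE = re.compile(r"\n\s*\n")


def split_paragraphs(pages: List[str]) -> List[Tuple[int, str]]:
    """Return (page_number, paragraph) pairs, with pages numbered from 1."""
    results: List[Tuple[int, str]] = []
    for page_number, page_text in enumerate(pages, start=1):
        # Normalise Windows and old Mac line endings to "\n".
        text = page_text.replace("\r\n", "\n").replace("\r", "\n")
        for block in BLANK_LINE.split(text):
            # Collapse line breaks inside a paragraph into single spaces.
            paragraph = " ".join(block.split())
            if paragraph:
                results.append((page_number, paragraph))
    return results


# Example input: two pages of extracted text (made-up content).
pages = [
    "First paragraph\ncontinues here.\n\nSecond paragraph.",
    "Third paragraph.\n \nFourth.",
]
paragraphs_with_pages = split_paragraphs(pages)
```

Keeping the page number next to each paragraph means a search result can point the reader back to the right page of the original article.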