Reading is something of a chore for most of the people in our group. It can sometimes be fun, but textbooks and journals are dense and complex. We wanted to make reading dense material easier.

What it does

Our website, DePedantify, takes a body of text, a PDF, or an image of a PDF and processes it, extracting first the text and then its meaning. Highlighting any single word brings up its Oxford dictionary definition for easy lookup. Important or industry-specific terms in your sentences are parsed out by Google's NLP engine and associated with a relevant Wikipedia page, the summary of which is also shown on the page. You can also comment on sentences in your documents so that other users can see your explanations, and other people may rate your annotations for helpfulness. Finally, you can mark texts as favorites so they are saved and can be viewed again without being reprocessed.
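
As a rough illustration of the annotation and rating idea, here is a minimal Python sketch. It is not our actual schema: the names `Annotation`, `rate`, `score`, and `ranked`, and the rule of one vote per user, are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Annotation:
    """A user comment attached to one sentence (hypothetical model)."""
    author: str
    comment: str
    # Maps voter name -> +1 (helpful) or -1 (not helpful).
    # Assumption: each user gets one vote; voting again overwrites it.
    votes: Dict[str, int] = field(default_factory=dict)

    def rate(self, voter: str, helpful: bool) -> None:
        self.votes[voter] = 1 if helpful else -1

    @property
    def score(self) -> int:
        return sum(self.votes.values())


def ranked(annotations: List[Annotation]) -> List[Annotation]:
    """Return annotations with the most helpful first."""
    return sorted(annotations, key=lambda a: a.score, reverse=True)


if __name__ == "__main__":
    first = Annotation("ana", "This term just means 'wordy'.")
    second = Annotation("ben", "See chapter 2 for background.")
    first.rate("reader1", helpful=True)
    second.rate("reader1", helpful=False)
    for note in ranked([first, second]):
        print(note.score, note.author, note.comment)
```

Keeping votes in a mapping keyed by voter is one simple way to stop a single user from rating the same annotation twice.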

How we built it

We used Tesseract to process the PDFs and extract their text. We used the Google Cloud morphology analyzer to tokenize the text and pick out terms for which Wikipedia would provide good context. We used a Flask server with SQLAlchemy to host our site, and pure CSS, HTML, and JavaScript to program the frontend, which is where the bulk of our features reside.
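
The step that ties extracted text to Wikipedia context can be sketched roughly as below. This is a simplified, standard-library-only illustration, not our real code: `Entity`, `AnnotatedSentence`, `split_sentences`, and `annotate` are hypothetical names, the sentence splitter is deliberately naive, and `fake_find_entities` is a stand-in for the call to the cloud NLP service.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Entity:
    name: str
    wikipedia_url: Optional[str]  # None when no page is known


@dataclass
class AnnotatedSentence:
    text: str
    links: List[Entity]


def split_sentences(text: str) -> List[str]:
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def annotate(
    text: str,
    find_entities: Callable[[str], Iterable[Entity]],
) -> List[AnnotatedSentence]:
    """Attach entities that have a Wikipedia page to each sentence."""
    annotated = []
    for sentence in split_sentences(text):
        links = [e for e in find_entities(sentence) if e.wikipedia_url]
        annotated.append(AnnotatedSentence(sentence, links))
    return annotated


def fake_find_entities(sentence: str) -> List[Entity]:
    # Stand-in for the NLP service: a tiny hard-coded lookup table.
    known = {"mitochondria": "https://en.wikipedia.org/wiki/Mitochondrion"}
    lowered = sentence.lower()
    return [Entity(name, url) for name, url in known.items() if name in lowered]


if __name__ == "__main__":
    sample = "Mitochondria make ATP. Cells need energy."
    for item in annotate(sample, fake_find_entities):
        print(item.text, [e.wikipedia_url for e in item.links])
```

Passing the entity finder in as a function keeps the linking logic testable without network access; a real version would swap in the OCR output and the actual service call.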

Challenges we ran into

Integrating all of these APIs was definitely a challenge, and none of us are that experienced with front-end web development.

Accomplishments that we're proud of

Despite our difficulties, we incorporated almost all of the features we set out to produce, a goal we thought ambitious at first. Given more time, there's still lots we'd like to do, but we are definitely happy with the product we were able to create.

What we learned

We learned a lot about tools we were unfamiliar with, including database construction, front-end development and styling, and using the Google Cloud Platform!

What's next for DePedantify

There are a lot of features we'd like to add, and a lot of room for optimizing and cleaning up what we have. This is a tool we all want to exist, so it's not out of the realm of possibility that we will continue this project.

