We are three PhD students. Thesis and paper writing is a part of academia, but it kinda sucks. Literature review especially. We built 'paper graph' to make literature review easier and better.
What it does
paper graph takes in key papers from your literature review. It visually relates the forward and backward citations of these papers. A network is built, showing which of the papers you have are most important, as well as suggesting papers your colleagues think are important but you haven't looked at yet.
Scholar search isn't very good. Jargon is common, and keywords can mean very different things in different fields. We built paper graph to automate how people already use citations to explore the literature.
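As a rough illustration of the idea (not our actual code), papers outside your collection can be ranked by how many of your key papers link to them, through either forward or backward citations. The function name, the input layout, and the toy paper IDs below are all made up for the example.

```python
from collections import Counter


def suggest_papers(seed_links, top_n=3):
    """Rank papers outside the collection by how many seed papers link to them.

    seed_links maps each seed paper ID to a dict with 'cites' (backward
    citations) and 'cited_by' (forward citations), both lists of paper IDs.
    This layout is an assumption made for the sketch.
    """
    owned = set(seed_links)
    counts = Counter()
    for links in seed_links.values():
        # Count each neighbour once per seed paper, in either direction.
        neighbours = set(links.get("cites", [])) | set(links.get("cited_by", []))
        for other in neighbours - owned:
            counts[other] += 1
    # Sort by count (descending), then by ID so ties come out in a fixed order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


# Hypothetical toy data: A, B and C are papers already in the collection.
seed_links = {
    "A": {"cites": ["X", "Y"], "cited_by": ["Z"]},
    "B": {"cites": ["X", "A"], "cited_by": ["Z", "W"]},
    "C": {"cites": ["X"], "cited_by": []},
}
suggestions = suggest_papers(seed_links)
```

Here paper X comes out on top because all three seed papers cite it, which is the kind of "you probably should have read this" signal the network is meant to surface.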
How we built it
paper graph is mainly built in Python.
We use the Mendeley API to get DOIs from the PDF files of papers you have. We search PubMed with these IDs, which gives forward and backward citations. We store these citations in a PostgreSQL database.
Beaker is used to visualise the network. Papers already in your collection are distinguished from papers you don't have yet. Clusters show more important works. Nodes of the network can be clicked, which links to the abstract of the article on PubMed.
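A minimal sketch of the storage step, assuming a single edge table with one row per citation. To keep the example self-contained it uses Python's built-in sqlite3 module as a stand-in for the real database; the table name, column names, and paper IDs are assumptions for illustration, not our actual schema.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the (assumed) citation edge table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS citations ("
        " citing TEXT NOT NULL,"
        " cited TEXT NOT NULL,"
        " PRIMARY KEY (citing, cited))"
    )
    return conn


def add_citations(conn, edges):
    """Insert (citing, cited) pairs, silently skipping duplicates."""
    # INSERT OR IGNORE is SQLite syntax; other databases spell this differently.
    conn.executemany(
        "INSERT OR IGNORE INTO citations (citing, cited) VALUES (?, ?)", edges
    )
    conn.commit()


def backward_citations(conn, paper):
    """Papers that `paper` cites."""
    rows = conn.execute(
        "SELECT cited FROM citations WHERE citing = ? ORDER BY cited", (paper,)
    )
    return [row[0] for row in rows]


def forward_citations(conn, paper):
    """Papers that cite `paper`."""
    rows = conn.execute(
        "SELECT citing FROM citations WHERE cited = ? ORDER BY citing", (paper,)
    )
    return [row[0] for row in rows]


# Hypothetical IDs; the last edge is a duplicate and is ignored.
conn = open_store()
add_citations(conn, [("p1", "p2"), ("p1", "p3"), ("p4", "p1"), ("p1", "p2")])
refs = backward_citations(conn, "p1")
citers = forward_citations(conn, "p1")
```

Storing one row per edge means forward and backward citations are the same table read in two directions, so nothing has to be stored twice.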
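To give a feel for what the visualisation needs as input, here is a sketch that turns citation edges into a nodes-and-links structure, flags which papers are already in your collection, and attaches a PubMed abstract link to each node. The JSON layout and field names are assumptions for illustration (not something Beaker requires), the PMIDs are made up, and the URL pattern is the current public PubMed one.

```python
import json


def to_graph_json(edges, owned):
    """Build a nodes/links JSON string from (citing_pmid, cited_pmid) pairs.

    `owned` is the set of PMIDs already in the user's collection.
    """
    edges = list(edges)  # allow any iterable, since we loop over it twice
    degree = {}
    for citing, cited in edges:
        degree[citing] = degree.get(citing, 0) + 1
        degree[cited] = degree.get(cited, 0) + 1
    nodes = [
        {
            "id": pmid,
            "owned": pmid in owned,
            "degree": degree[pmid],  # more links = more central in the network
            "url": f"https://pubmed.ncbi.nlm.nih.gov/{pmid}/",
        }
        for pmid in sorted(degree)
    ]
    links = [{"source": citing, "target": cited} for citing, cited in edges]
    return json.dumps({"nodes": nodes, "links": links}, indent=2)


# Hypothetical PMIDs: "111" is in the collection, the others are not.
graph = json.loads(to_graph_json([("111", "222"), ("333", "222")], owned={"111"}))
```

A front end can then colour nodes by the "owned" flag, size them by "degree", and open "url" when a node is clicked.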
Challenges we ran into
Data access was hard - Google Scholar locked us out for too many GET calls! Paper identifiers differ between sources - Google Scholar, Mendeley, and PubMed. A fair bit of our work was translating between the various ID numbers.
We spent quite a bit of time trying to scrape the PDF files themselves for metadata before finding a much easier solution through the Mendeley API!
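The ID translation can be pictured as a small crosswalk table: whenever two identifiers are seen together for the same paper, they get filed under one internal record. This is a simplified sketch rather than our actual code; the class, the scheme names ("doi", "pmid", "library_id"), and the example identifiers are all hypothetical. The one hard fact it relies on is that DOIs are case-insensitive and often show up with a resolver URL or "doi:" prefix.

```python
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(raw):
    """Reduce common DOI spellings to a bare lowercase DOI."""
    doi = raw.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.strip()


class IdCrosswalk:
    """Maps between identifier schemes for the same paper."""

    def __init__(self):
        self._key_of = {}   # (scheme, value) -> internal record key
        self._records = {}  # internal record key -> {scheme: value}

    @staticmethod
    def _clean(scheme, value):
        return normalise_doi(value) if scheme == "doi" else str(value).strip()

    def add(self, **ids):
        """Register identifiers known to refer to the same paper."""
        pairs = [(scheme, self._clean(scheme, value)) for scheme, value in ids.items()]
        # Reuse an existing record if any identifier is already known.
        # (Merging two existing records is left out of this sketch.)
        key = next((self._key_of[p] for p in pairs if p in self._key_of), None)
        if key is None:
            key = len(self._records)
            self._records[key] = {}
        for scheme, value in pairs:
            self._key_of[(scheme, value)] = key
            self._records[key][scheme] = value
        return key

    def translate(self, scheme, value, target):
        """Return the `target` identifier for a paper, or None if unknown."""
        key = self._key_of.get((scheme, self._clean(scheme, value)))
        if key is None:
            return None
        return self._records[key].get(target)


# Hypothetical identifiers, learned from two different sources.
xw = IdCrosswalk()
xw.add(doi="https://doi.org/10.1000/XYZ123", pmid="12345")
xw.add(pmid="12345", library_id="abc-def")
found = xw.translate("doi", "doi:10.1000/xyz123", "library_id")
```

Because the second call shares a PMID with the first, the DOI and the library ID end up on the same record even though no single source reported both.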
Accomplishments that we're proud of
We got it done!
What we learned
We learned a lot about HTTP requests, using Beaker and various APIs (or not APIs).
Spend more time looking for a good API!
What's next for paper graph
We'll use it in our theses, and hopefully develop it further!