Inspiration
My Google Drive holds all of my research materials, and I wanted a way to automatically retrieve context from them and write back new findings. Connecting Model Context Protocol (MCP) tools to Google Drive and to scholarly sources such as arXiv, PubMed, and Google Scholar streamlines and automates that process.
What It Does
This MCP server enables AI clients and language models to:
- Search scholarly papers across arXiv, PubMed, Semantic Scholar, and Google Scholar
- Retrieve full text and metadata from papers
- Extract relationships between concepts, compare papers, and analyze metadata
- Integrate with Google Drive for document storage and context retrieval
How We Built It
We built API connectors for arXiv, PubMed (via E-utilities), Semantic Scholar, and Google Scholar. Google Drive integration was implemented using the Google Drive API and OAuth2 for authentication. LLM-based tools (powered by Claude) handle summarization, comparison, and relation extraction. We used MCP Inspector for interactive testing and debugging.
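As a rough illustration of what a connector's first step might look like, the sketch below only builds search URLs for the public arXiv API and the PubMed E-utilities `esearch` endpoint. The function names and defaults are assumptions made for this example, not the project's actual code, and fetching and parsing the responses (Atom XML from arXiv, JSON from `esearch` when `retmode=json`) is left out.

```python
from urllib.parse import urlencode

# Public endpoints for the two services.
ARXIV_API = "http://export.arxiv.org/api/query"
PUBMED_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def arxiv_search_url(query: str, max_results: int = 10) -> str:
    """Build an arXiv API URL that searches all fields (hypothetical helper)."""
    params = {
        "search_query": f"all:{query}",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def pubmed_search_url(query: str, max_results: int = 10) -> str:
    """Build an E-utilities esearch URL that returns PubMed IDs as JSON (hypothetical helper)."""
    params = {
        "db": "pubmed",
        "term": query,
        "retmax": max_results,
        "retmode": "json",
    }
    return f"{PUBMED_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(arxiv_search_url("graph neural networks", max_results=5))
    print(pubmed_search_url("crispr off-target", max_results=5))
```

Keeping URL construction separate from the HTTP call makes each connector easy to test without network access.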
Challenges We Faced
- Google Scholar lacks an official API and is rate-limited
- Google Drive integration required complex OAuth2 setup
- PDF parsing needed fallback logic due to inconsistent formatting
- Normalizing metadata across different sources was non-trivial
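To make the metadata normalization challenge concrete, here is a minimal sketch of mapping two source-specific records onto one shared shape. The `Paper` dataclass, the converter functions, and the input field names (`id`, `published`, `pmid`, `pubdate`, and so on) are all assumptions chosen for illustration; the real connectors may expose different fields.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Paper:
    """Hypothetical common record that every connector converts into."""
    source: str
    source_id: str
    title: str
    authors: List[str] = field(default_factory=list)
    year: Optional[int] = None


def _clean(text: str) -> str:
    # Collapse runs of whitespace and newlines, common in feed titles.
    return " ".join(text.split())


def _year(text: str) -> Optional[int]:
    # Take a leading four-digit year if present, e.g. "2021-01-01" or "2020 Mar".
    head = text[:4]
    return int(head) if len(head) == 4 and head.isdigit() else None


def from_arxiv(entry: dict) -> Paper:
    # Assumed input: authors as a list of full-name strings, ISO 'published' date.
    return Paper(
        source="arxiv",
        source_id=entry["id"],
        title=_clean(entry["title"]),
        authors=[_clean(name) for name in entry.get("authors", [])],
        year=_year(entry.get("published", "")),
    )


def from_pubmed(record: dict) -> Paper:
    # Assumed input: authors as dicts with 'fore' and 'last' name parts.
    authors = [
        " ".join(part for part in (a.get("fore", ""), a.get("last", "")) if part)
        for a in record.get("authors", [])
    ]
    return Paper(
        source="pubmed",
        source_id=record["pmid"],
        title=_clean(record["title"]),
        authors=authors,
        year=_year(record.get("pubdate", "")),
    )
```

With one record type, downstream tools such as comparison or relation extraction only need to handle a single schema regardless of where a paper came from.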
What We Learned
We learned how to build robust MCP-compatible workflows and how to navigate the challenges of aggregating scholarly content. For example, PubMed indexes citations and abstracts but does not host full-text PDFs, so our retrieval pipeline had to be adapted for it.
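One generic way to handle a source that cannot supply full text is an ordered fallback chain. The sketch below is an assumption about how such a pipeline could be structured, not a description of the project's actual implementation; `fetch_full_text` and the fetcher signature are hypothetical names.

```python
from typing import Callable, Iterable, Optional, Tuple

# A fetcher takes a paper identifier and returns text, or None if it has nothing.
Fetcher = Callable[[str], Optional[str]]


def fetch_full_text(
    paper_id: str, fetchers: Iterable[Tuple[str, Fetcher]]
) -> Optional[Tuple[str, str]]:
    """Try each named fetcher in order and return (name, text) from the first that succeeds."""
    for name, fetch in fetchers:
        try:
            text = fetch(paper_id)
        except Exception:
            # A failing source (network error, unparseable PDF) should not
            # stop the chain; move on to the next option.
            continue
        if text:
            return name, text
    return None
```

A caller might order the fetchers from richest to poorest result, for example a full-text source first and an abstract-only source last, so that some context is still returned when no PDF is available.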
What's Next for MCP Research
- Integrating semantic vector search over full papers and generated summaries
- Building a research assistant that automates literature reviews and proposal generation
- Integrating with Slack and Notion for collaborative workflows
- Building a frontend that visualizes citation networks and concept graphs