Inspiration

My Google Drive holds all of my research materials, and I wanted a way to automatically retrieve context from them and write new findings back. Connecting Model Context Protocol (MCP) tools to sources like Google Drive, arXiv, PubMed, and Google Scholar streamlines and automates that process.

What It Does

This MCP server enables AI clients and language models to:

  • Search scholarly papers across arXiv, PubMed, Semantic Scholar, and Google Scholar
  • Retrieve full text and metadata from papers
  • Extract relationships between concepts, compare papers, and analyze metadata
  • Integrate with Google Drive for document storage and context retrieval

How We Built It

We built API connectors for arXiv, PubMed (via E-utilities), Semantic Scholar, and Google Scholar. Google Drive integration was implemented using the Google Drive API and OAuth2 for authentication. LLM-based tools (powered by Claude) handle summarization, comparison, and relation extraction. We used MCP Inspector for interactive testing and debugging.
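As a rough illustration of what the search side of such a connector might look like, the sketch below builds query URLs for the public arXiv API and the PubMed E-utilities `esearch` endpoint. The function names and default parameters are assumptions made for this example, not the project's actual code, and no network request is made here.

```python
from urllib.parse import urlencode

# Public endpoints; the helper names below are hypothetical.
ARXIV_API = "http://export.arxiv.org/api/query"
PUBMED_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def arxiv_search_url(query: str, max_results: int = 10) -> str:
    """Build an arXiv API URL that searches all fields for `query`."""
    params = {
        "search_query": f"all:{query}",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def pubmed_search_url(query: str, max_results: int = 10) -> str:
    """Build an E-utilities esearch URL that returns PubMed IDs as JSON."""
    params = {
        "db": "pubmed",
        "term": query,
        "retmax": max_results,
        "retmode": "json",
    }
    return f"{PUBMED_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(arxiv_search_url("graph neural networks", max_results=5))
    print(pubmed_search_url("CRISPR off-target effects", max_results=5))
```

An MCP tool handler could fetch these URLs with any HTTP client and parse the Atom (arXiv) or JSON (PubMed) response before returning results to the model.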

Challenges We Faced

  • Google Scholar has no official API and rate-limits automated requests
  • Google Drive integration required complex OAuth2 setup
  • PDF parsing needed fallback logic due to inconsistent formatting
  • Normalizing metadata across different sources was non-trivial
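To make the metadata-normalization challenge concrete, here is a minimal sketch that maps records from two sources onto one shared shape. The `Paper` class, the function names, and the input dictionary layouts are assumptions for illustration (the arXiv dict stands in for an already-parsed Atom entry, and the Semantic Scholar dict loosely follows its JSON response); the project's real schema may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Paper:
    """Hypothetical common record shared by all connectors."""
    source: str
    title: str
    authors: List[str] = field(default_factory=list)
    year: Optional[int] = None
    doi: Optional[str] = None


def _clean(text: Optional[str]) -> str:
    """Collapse runs of whitespace and newlines into single spaces."""
    return " ".join((text or "").split())


def _norm_doi(value: Optional[str]) -> Optional[str]:
    """Lowercase and trim a DOI; return None when it is missing or empty."""
    cleaned = (value or "").strip().lower()
    return cleaned or None


def from_arxiv(raw: dict) -> Paper:
    """Assumes keys: title, authors (list of str), published (ISO date), doi."""
    published = raw.get("published") or ""
    year = int(published[:4]) if published[:4].isdigit() else None
    return Paper(
        source="arxiv",
        title=_clean(raw.get("title")),
        authors=[_clean(a) for a in raw.get("authors", [])],
        year=year,
        doi=_norm_doi(raw.get("doi")),
    )


def from_semantic_scholar(raw: dict) -> Paper:
    """Assumes keys: title, authors (list of {"name": ...}), year, externalIds."""
    external = raw.get("externalIds") or {}
    return Paper(
        source="semantic_scholar",
        title=_clean(raw.get("title")),
        authors=[_clean(a.get("name")) for a in raw.get("authors", [])],
        year=raw.get("year"),
        doi=_norm_doi(external.get("DOI")),
    )
```

Keeping one small adapter per source means downstream tools such as comparison or relation extraction only ever see `Paper` objects, and a normalized DOI gives a simple key for de-duplicating the same paper found through different sources.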

What We Learned

We learned how to build robust MCP-compatible workflows and how to handle the quirks of aggregating scholarly content. For example, PubMed indexes citations and abstracts but does not host full-text PDFs, so we had to adapt our retrieval pipeline accordingly.
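One way to structure that kind of adaptation is a fallback chain: try several full-text strategies in order and keep the first one that yields usable text. The sketch below shows the pattern only; `first_full_text` and the fetcher names in the usage example are hypothetical and do not describe the project's actual pipeline.

```python
from typing import Callable, Iterable, Optional, Tuple

# A fetcher takes a paper identifier and returns text, or None if unavailable.
Fetcher = Callable[[str], Optional[str]]


def first_full_text(
    paper_id: str,
    fetchers: Iterable[Tuple[str, Fetcher]],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (strategy_name, text) from the first fetcher that succeeds.

    A fetcher that raises or returns empty text is skipped, so one failing
    source or parser does not abort the whole lookup.
    """
    for name, fetch in fetchers:
        try:
            text = fetch(paper_id)
        except Exception:
            # Treat any failure as "not available here" and move on.
            continue
        if text and text.strip():
            return name, text
    return None, None


if __name__ == "__main__":
    # Stub fetchers standing in for real sources and PDF parsers.
    def publisher_pdf(pid: str) -> Optional[str]:
        raise RuntimeError("PDF could not be parsed")

    def abstract_only(pid: str) -> Optional[str]:
        return f"Abstract for {pid}"

    print(first_full_text("example-id", [
        ("publisher_pdf", publisher_pdf),
        ("abstract_only", abstract_only),
    ]))
```

The same pattern fits PDF parsing with inconsistent formatting: each parser becomes one entry in the chain, with the most reliable option listed first and an abstract-only result as the last resort.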

What's Next for MCP Research

  • Integrating semantic vector search over full papers and generated summaries
  • Building a research assistant that automates literature reviews and proposal generation
  • Adding Slack and Notion integration for collaborative workflows
  • Visualizing citation networks and concept graphs in a frontend
