Inspiration

Even within a single research area, the volume of new arXiv submissions makes it hard to keep up. Our team kept bookmarking papers in group chats, only to lose track of who read what. We wanted a lightweight companion that could scan the latest uploads, surface the highlights, and package them into a briefing that teammates could skim between hackathon shifts.

What it does

PaperLink turns a topic prompt into a polished research digest in under a minute. The FastAPI backend pulls the latest arXiv papers, builds MiniLM embeddings, clusters related work, and uses Claude Haiku 4.5 via Lava to write clean summaries and cluster labels. The Next.js 14 frontend offers “Browse” for quick paper cards and “Digest” for daily, weekly, or monthly briefings. Highlight any sentence to hear it read aloud with Fish Audio TTS. Both views are backed by cached Chroma Cloud collections.
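To make the "clusters related work" step concrete, here is a minimal sketch of k-means over embedding vectors. It is an illustration only: the toy 2-D vectors stand in for real MiniLM embeddings, the `kmeans` helper and its first-k initialisation are assumptions made for this example, and a real backend would more likely call a library implementation.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(points: List[Vector], k: int, max_iters: int = 50) -> List[int]:
    """Return one cluster label per point using Lloyd's algorithm."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Assumption: deterministic init from the first k points, chosen so the
    # example is reproducible. Library implementations use smarter seeding.
    centroids = [list(p) for p in points[:k]]
    labels = [-1] * len(points)
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lbl in zip(points, labels) if lbl == c]
            if members:
                centroids[c] = mean(members)
    return labels


# Toy stand-ins for paper embeddings: two obvious groups.
toy_embeddings = [[0.0, 0.1], [5.0, 5.1], [0.1, 0.0], [5.1, 5.0]]
labels = kmeans(toy_embeddings, k=2)
```

Each label can then be used to group paper cards before a summary and a cluster name are requested for the group.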

How we built it

We designed the workflow with mentors, prototyped with AI copilots, and built FastAPI endpoints for arXiv ingestion, Chroma caching, KMeans clustering, and prompt-driven summarization. The Next.js frontend handles search, digest rendering, and TTS integration, with Tailwind for visual polish. Results are cached under stable topic IDs, allowing near-instant reloads and offline use.
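One way to get a stable topic ID is to hash a normalized form of the topic string, so that small differences in casing or spacing map to the same cache entry. The sketch below is an assumption about how such an ID could be derived, not the project's actual scheme; the `topic_id` function, the `window` parameter, and the `topic-` prefix are hypothetical names.

```python
import hashlib
import re


def topic_id(topic: str, window: str = "weekly") -> str:
    """Derive a deterministic cache key from a topic prompt.

    Hypothetical helper: the normalization rules and key format are
    illustrative choices, not taken from the PaperLink codebase.
    """
    # Collapse whitespace and lowercase so "Graph  Neural Networks" and
    # "graph neural networks" share one cache entry.
    normalized = re.sub(r"\s+", " ", topic.strip().lower())
    digest = hashlib.sha256(f"{window}:{normalized}".encode("utf-8")).hexdigest()
    # A 16-hex-character prefix keeps the ID short enough for URLs.
    return f"topic-{digest[:16]}"


same_a = topic_id("  Graph   Neural Networks ")
same_b = topic_id("graph neural networks")
monthly = topic_id("graph neural networks", window="monthly")
```

Because the ID depends only on the normalized topic and the digest window, the same request always resolves to the same cached collection, while a different window gets its own entry.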

Challenges we ran into

Handling arXiv rate limits, batching prompts, and keeping JSON outputs stable were tough. Integrating FastAPI, Chroma, Lava, and Fish Audio at once stretched our time and focus. Cache management, URL sync bugs, and audio streaming that broke on flaky Wi-Fi also tested our debugging patience.
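Two of these problems, batching prompts and keeping JSON outputs stable, lend themselves to a small defensive pattern: split the papers into fixed-size batches, then accept a model reply only if it contains a JSON object with the expected keys. The sketch below assumes a hypothetical reply schema with `label` and `summary` fields; the helper names are illustrative and the model call itself is left out.

```python
import json
from typing import Iterator, List, Optional, Sequence, TypeVar

T = TypeVar("T")

# Assumption: each reply should be an object with these two fields.
REQUIRED_KEYS = {"label", "summary"}


def batched(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def parse_summary(raw: str) -> Optional[dict]:
    """Return the JSON object found in a model reply, or None if unusable.

    Takes the span from the first "{" to the last "}" so that chatty
    text around the JSON does not break parsing.
    """
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not REQUIRED_KEYS.issubset(data):
        return None
    return data


batches = list(batched(["p1", "p2", "p3", "p4", "p5"], size=2))
good = parse_summary('Here you go:\n{"label": "GNNs", "summary": "Graph models."}\nDone.')
missing_key = parse_summary('{"label": "GNNs"}')
not_json = parse_summary("Sorry, I could not summarize that.")
```

A caller can treat a `None` result as a signal to retry the batch or fall back to an unlabeled cluster, which keeps one malformed reply from breaking the whole digest.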

Accomplishments that we’re proud of

We built an end-to-end system that goes from topic to digest in under a minute. Cached embeddings made re-runs nearly instant. Seeing highlight-to-speech work live was a big win, and Chroma Cloud let teammates share digests effortlessly.

What we learned

Prompt design and JSON structure directly affect output quality. Vector caching saves significant time and compute. We also learned how much clean Git habits, small commits, and early UX feedback matter when building real-time research tools.

What's next for PaperLink

  • Productionize the Fish Audio integration with configurable voices and offline caching so listening works on mobile.
  • Add team workspaces with shared topics, email/Slack digests, and “follow” alerts for emerging clusters.
  • Expand beyond arXiv by tapping publisher APIs and open datasets, with per-track prompt tuning.
  • Layer in analytics, such as trendlines for embedding clusters or “papers similar to what your lab cited last week.”
  • Ship a Chrome extension so you can save an interesting abstract in one click and have it show up in your next briefing.
