Inspiration

CitaMind came from recent articles showing that many papers published at top machine learning conferences, including NeurIPS 2025, contain "hallucinated citations": references to papers that simply don't exist, likely generated by AI without the author realizing. GPTZero's investigation found over 100 phantom citations across 51 NeurIPS papers, each missed by 3-5 peer reviewers. One study found that 55% of the citations generated by ChatGPT-3.5 were fabricated (Walters & Wilder, 2023). The US government's MAHA health report contained 19+ phantom citations. If the world's best researchers and peer reviewers can't catch this, we need automated tools that can.

Initial Disclosure

At the start of the hackathon, we were unaware of GPTZero's Hallucination Check tool, which addresses a similar problem. Our positioning evolved to focus on what makes CitaMind different: (1) a search-first UX where users find any paper from any venue without uploading a PDF, (2) a community-powered Phantom Registry that grows smarter with every scan, (3) real-time streaming verification with inline text highlighting, and (4) structured AI research summaries with gap analysis, all features that GPTZero does not offer.

What It Does

Users create an account, then search for papers across Nature, arXiv, ACM, PubMed, IEEE, Springer, and thousands more venues. Once they select a paper, it is analyzed in seconds. The system provides:

  1. Phantom Citation Detection: identifies references to papers that cannot be found in Semantic Scholar, OpenAlex, or CrossRef, with Claude-powered confirmation before a reference is flagged.

  2. Miscited & Retracted Detection: flags citations used to support claims that do not appear in the cited work, and papers that have been officially retracted by their publisher.

  3. AI Writing Score: estimates how much of the paper was likely written with AI, with a per-section breakdown and flagged passages.

  4. Research Summary with Gap Analysis: users can generate and listen to (via ElevenLabs) a structured AI summary covering key findings, methodology, contributions, and gaps/limitations in the research.

  5. Community Phantom Registry: when users save a paper to their library (stored in HarperDB), all detected phantom citations automatically feed into a shared, open-source database. An author leaderboard, venue breakdown chart, and year-over-year trend graph visualize the data. The system grows smarter as more papers are scanned.

How We Built It

Frontend: Next.js 14 (App Router) + TypeScript + Tailwind CSS + shadcn/ui, with a retro box-style UI. Developed entirely in Zed.

Paper Search & Data:

  • OpenAlex API — primary search engine with 16,000+ papers pre-cached across CHI, NeurIPS, ACL, EMNLP, UIST, CVPR, and 30+ venues
  • Semantic Scholar API — reference fetching with three-tier fallback (DOI → ARXIV prefix → title search)
  • CrossRef API — citation cross-verification with strong title matching (Jaccard + Levenshtein + containment)
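
To make the title matching concrete, here is a minimal sketch of how Jaccard, Levenshtein, and containment signals could be combined. The function names, the normalization, and every threshold (0.8, 0.9, a 15-character minimum for containment) are assumptions chosen for illustration; the writeup does not say how CitaMind weighs the three signals.

```typescript
// Illustrative sketch only: names and thresholds are assumptions, not CitaMind's code.

interface TitleSimilarity {
  jaccard: number;      // token-set overlap, 0..1
  levenshtein: number;  // 1 - editDistance / maxLength, 0..1
  contained: boolean;   // shorter title appears inside the longer one
  isMatch: boolean;     // combined decision (assumed rule)
}

// Lowercase, drop punctuation, collapse whitespace.
function normalizeTitle(title: string): string {
  return title.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

function uniqueTokens(text: string): string[] {
  return text.split(" ").filter((t, i, all) => t.length > 0 && all.indexOf(t) === i);
}

function jaccardSimilarity(a: string, b: string): number {
  const ta = uniqueTokens(a);
  const tb = uniqueTokens(b);
  if (ta.length === 0 && tb.length === 0) return 1;
  const shared = ta.filter((t) => tb.indexOf(t) !== -1).length;
  return shared / (ta.length + tb.length - shared);
}

// Classic dynamic-programming edit distance, keeping only two rows.
function editDistance(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

function compareTitles(cited: string, candidate: string): TitleSimilarity {
  const a = normalizeTitle(cited);
  const b = normalizeTitle(candidate);
  const maxLen = Math.max(a.length, b.length);
  const jaccard = jaccardSimilarity(a, b);
  const levenshtein = maxLen === 0 ? 1 : 1 - editDistance(a, b) / maxLen;
  const shorter = a.length <= b.length ? a : b;
  const longer = a.length <= b.length ? b : a;
  // Require a minimum length so very short titles do not match by accident.
  const contained = shorter.length >= 15 && longer.indexOf(shorter) !== -1;
  const isMatch =
    a.length > 0 && b.length > 0 && (jaccard >= 0.8 || levenshtein >= 0.9 || contained);
  return { jaccard, levenshtein, contained, isMatch };
}
```

With these assumed thresholds, a cited title that differs from the database title only in casing or punctuation matches on all three signals, while a truncated title (a subtitle dropped, for instance) is still caught by containment.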

Paper Rendering:

  • ar5iv (ar5iv.labs.arxiv.org): renders any arXiv paper as clean HTML, with LaTeX math rendered via MathML
  • Citation numbers in the text are clickable pills that scroll to and highlight the corresponding reference

Verification Pipeline (SSE streaming):

S2 References → Title Quality Filter → Claude Haiku Title Check → Multi-Signal Scoring (title existence + author overlap + placeholder detection) → Claude Batch Confirmation → Final Classification

  • References that resolve to a Semantic Scholar (S2) paperId are treated as pre-verified
  • Unverified references are checked against OpenAlex and CrossRef, with author overlap verification
  • Placeholder author detection catches "John Doe", "Jane Smith", "Firstname Lastname"
  • Claude confirms borderline cases before they are marked as phantom
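
The rules above can be sketched as a small classifier. This is a hypothetical illustration, not the project's scoring code: the `ReferenceCheck` shape, the verdict names, the surname-based overlap, and the 0.5 overlap threshold are all assumptions. The surname comparison also assumes "First Last" name order and ASCII names, which a real pipeline would need to handle more carefully.

```typescript
// Hypothetical sketch: types, verdict names, and thresholds are assumptions.

type Verdict = "verified" | "borderline" | "phantom_candidate";

interface ReferenceCheck {
  hasS2PaperId: boolean;      // reference resolved to a Semantic Scholar paperId
  titleFound: boolean;        // title matched in OpenAlex or CrossRef
  citedAuthors: string[];     // authors as written in the citing paper
  databaseAuthors: string[];  // authors returned by the database match
}

// The three placeholder names mentioned in the writeup.
const PLACEHOLDER_AUTHORS = ["john doe", "jane smith", "firstname lastname"];

function normalizeName(name: string): string {
  return name.toLowerCase().replace(/[^a-z\s]/g, "").replace(/\s+/g, " ").trim();
}

// Assumes "First Last" order; the last token is taken as the surname.
function surname(name: string): string {
  const parts = normalizeName(name).split(" ");
  return parts[parts.length - 1];
}

// Fraction of cited surnames that also appear among the database authors.
function authorOverlap(cited: string[], database: string[]): number {
  const citedSurnames = cited.map(surname).filter((s) => s.length > 0);
  if (citedSurnames.length === 0) return 0;
  const known = database.map(surname);
  const hits = citedSurnames.filter((s) => known.indexOf(s) !== -1).length;
  return hits / citedSurnames.length;
}

function hasPlaceholderAuthor(authors: string[]): boolean {
  return authors.some((a) => PLACEHOLDER_AUTHORS.indexOf(normalizeName(a)) !== -1);
}

function classify(ref: ReferenceCheck): Verdict {
  if (ref.hasS2PaperId) return "verified";           // pre-verified by S2
  if (!ref.titleFound) return "phantom_candidate";   // no database knows the title
  if (hasPlaceholderAuthor(ref.citedAuthors)) return "borderline";
  // Real title with mismatched authors is the author-swap case.
  return authorOverlap(ref.citedAuthors, ref.databaseAuthors) >= 0.5
    ? "verified"
    : "borderline";
}
```

In this sketch nothing is labeled a phantom outright: both non-verified verdicts are only candidates, which leaves room for the Claude confirmation step to make the final call.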

AI Detection: Claude Sonnet analyzes each section for AI writing patterns with conservative scoring (0-100 scale; most papers score between 0 and 5).

Research Summary: Claude Haiku generates structured summaries (TL;DR, key findings, methodology, contributions, gaps). ElevenLabs reads the summary aloud with pause/resume support.

Database: HarperDB stores users, paper library, and the global Phantom Registry.

Data Visualization: Recharts for venue breakdown donut chart and year-over-year phantom trends. Author leaderboard ranks researchers by phantom citation count.

Challenges

  • Semantic Scholar rate limits: 100 requests per 5 minutes unauthenticated. Solved with the three-tier fallback (DOI → ARXIV → title search) and aggressive caching.
  • False phantom detections: S2's API sometimes returns body-text fragments as references. Solved with regex title filters plus a Claude Haiku batch check.
  • OpenAlex reference undercounting: many papers show 0 references in OpenAlex. Solved by always using Semantic Scholar for references.
  • ar5iv theorem box sizing: inline styles and SVGs produced oversized empty blocks. Solved by stripping all inline styles and SVGs from the HTML.
  • Author-swap hallucinations: some citations pair real paper titles with fabricated authors. Solved with author overlap verification during the multi-source check.
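
As a rough illustration of the three-tier fallback with caching, the sketch below builds the ordered list of Semantic Scholar lookups and tries them in turn. The `PaperHandle` type, the helper names, and the in-memory `Map` cache are assumptions; the `DOI:` and `ARXIV:` ID prefixes and the `/paper/search` endpoint belong to the Semantic Scholar Graph API, but how CitaMind actually calls them is not described in this writeup.

```typescript
// Hypothetical sketch: helper names and the cache strategy are assumptions.

interface PaperHandle {
  doi?: string;
  arxivId?: string;
  title: string;
}

const S2_BASE = "https://api.semanticscholar.org/graph/v1";

// Ordered lookups: DOI first, then the ARXIV prefix, then a title search.
// Each resolves the paper record; references would then be fetched for it.
function buildLookupUrls(paper: PaperHandle): string[] {
  const urls: string[] = [];
  if (paper.doi) urls.push(`${S2_BASE}/paper/DOI:${paper.doi}`);
  if (paper.arxivId) urls.push(`${S2_BASE}/paper/ARXIV:${paper.arxivId}`);
  urls.push(`${S2_BASE}/paper/search?query=${encodeURIComponent(paper.title)}&limit=1`);
  return urls;
}

// Try each tier in order, returning the first hit. Successful responses are
// cached by URL so repeated scans do not spend the rate-limit budget again.
// `fetchJson` is injected by the caller and returns null on a miss.
async function fetchWithFallback<T>(
  urls: string[],
  fetchJson: (url: string) => Promise<T | null>,
  cache: Map<string, T>
): Promise<T | null> {
  for (const url of urls) {
    const cached = cache.get(url);
    if (cached !== undefined) return cached;
    const result = await fetchJson(url);
    if (result !== null) {
      cache.set(url, result);
      return result;
    }
  }
  return null;
}
```

Injecting `fetchJson` keeps the fallback logic independent of the HTTP client, so the same function can run against a stub in tests or against a rate-limited client in production.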

What We Learned

  • ar5iv is an incredible resource: any arXiv paper instantly becomes clean, parseable HTML with proper math rendering
  • Academic citation hallucination is a much harder problem than "does this paper exist?": 25% of hallucinations use real paper titles with fabricated metadata
  • Community databases create network effects: every scan makes the phantom registry more valuable for everyone
  • Conservative classification is critical: a single false positive destroys credibility in a demo

What's Next

  • Author-level verification for S2-resolved references (comparing cited authors against database authors)
  • Claim verification using Claude to check if cited papers actually support the claims made
  • Browser extension for inline verification while reading papers on arXiv/ACM/IEEE
  • Institutional dashboards for journals and conference program committees

Built With

  • anthropic-claude-api
  • ar5iv
  • crossref-api
  • d3.js
  • elevenlabs-api
  • harperdb
  • katex
  • next.js
  • openalex-api
  • react
  • recharts
  • semantic-scholar-api
  • shadcn/ui
  • tailwind-css
  • typescript
  • vercel
  • zed