Inspiration
Machine learning researchers spend an enormous amount of time navigating papers, skimming abstracts, scanning for datasets, checking evaluation metrics, and trying to determine whether a paper is actually relevant to their work.
Keyword search and abstracts simply aren't enough. We need a way to identify the most relevant parts of a paper based on a user's research interest and to discover related work more intelligently.
ArXtract was built to solve that.
What it does
Paste an arXiv link. Enter your research interests. Get everything you need.
- Key Sections: Extracts title, problem statement, contribution, architecture, datasets, metrics, baselines, results, limitations, etc. into structured fields.
- Relevance Scoring: Enter your research topic to get a 0–100 relevance score, together with the paper's abstract and the top 5 most relevant text chunks.
- Related Papers: Discovers similar papers on arXiv and ranks them by cosine similarity to your research interest.
- Research Query: Ask follow-up questions about the paper and get context-constrained answers.
How I built it
ArXtract is built around a multi-stage retrieval pipeline. When a user submits an arXiv link, the backend parses the PDF, strips the references and symbols, and segments the paper into overlapping chunks.
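The preprocessing step can be sketched roughly as follows. This is a minimal illustration, not ArXtract's actual code: the function names `strip_references` and `chunk_words`, the heading-based reference detection, and the word-based chunk size and overlap are all assumptions made for the example.

```python
import re


def strip_references(text: str) -> str:
    """Drop everything from the last 'References'/'Bibliography' heading onward.

    Assumption: the reference list starts at a line containing only that word.
    """
    pattern = r"(?im)^[ \t]*(references|bibliography)[ \t]*$"
    matches = list(re.finditer(pattern, text))
    return text[: matches[-1].start()] if matches else text


def chunk_words(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words
    with the previous chunk. Both values are hypothetical defaults."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

Overlapping windows mean a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval quality.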
Both the user’s research query and the paper’s abstract + chunks are converted into embeddings. Cosine similarity is used to compute fine-grained relevance scores. The top candidates are then refined through an LLM for structured extraction and context-constrained question answering.
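To make the scoring step concrete, here is a small sketch of cosine-similarity ranking over embedding vectors. It uses plain Python lists rather than NumPy to stay self-contained. The helper names and the way a similarity is mapped onto the 0–100 scale (clamping to [0, 1] and multiplying by 100) are assumptions for illustration; the embedding model itself is out of scope here.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def to_score(similarity: float) -> int:
    """Map a similarity onto 0-100 (assumed mapping: clamp to [0, 1], scale)."""
    return round(max(0.0, min(1.0, similarity)) * 100)


def top_chunks(query_vec: list[float],
               chunk_vecs: list[list[float]],
               k: int = 5) -> list[tuple[int, float]]:
    """Return (chunk_index, similarity) pairs for the k best-matching chunks."""
    scored = [(i, cosine_similarity(query_vec, vec))
              for i, vec in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

The same ranking function could plausibly be reused for related papers by passing the embeddings of candidate abstracts instead of chunk embeddings.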
The frontend is built in React with a terminal-inspired interface, while the backend is powered by FastAPI and deployed via Vercel.
Accomplishments that I'm proud of
This was my first time building a frontend with TypeScript and React, and I'm proud that I was able to build a complete, functional interface. I especially love the macOS terminal-inspired UI. I think it makes the system look focused and really clean.
Built With
- css
- fastapi
- llm
- numpy
- pydantic
- python
- react
- typescript
- vite