Inspiration
Every research project starts with a folder of PDFs that just keeps growing. You know the answer is in there somewhere, but Cmd+F fails and Google doesn't know which papers you have. We wanted a tool that reads your papers, answers in plain English, and always shows you where it got the answer from.
What it does
Drop your PDFs into a web app, ask a question, get a written answer back with every claim linked to the exact passage it came from. You can filter by year or section, and your library stays private to you.
How we built it
Frontend: React + Vite + Tailwind — upload, ask, read. Backend: a small Node/Express server that handles uploads and talks to Gemini. Database: MongoDB Atlas, because it gave us both vector search and keyword search in one place.
Challenges we ran into
"The biggest one: the RAG kept returning the wrong answers. At first we built it the way most "chat with your docs" tools are built — pure semantic search, just comparing the meaning of your question to the meaning of each passage. It sounded smart in theory and it was fine for vague questions, but it kept missing the obvious stuff. Ask for a specific author, a specific number, a specific technical term, and it would hand back passages that were vaguely related instead of the one that literally said the thing.
The fix was to stop relying on similarity alone. We added a second search that looks for exact words the old-fashioned way, ran both searches together, and merged the rankings so a passage wins if either method thought it was good
Accomplishments that we're proud of
It actually cites its sources — every answer links back to a real passage. The hybrid search runs in a single database query, so the whole thing stays small and fast. From hitting enter to seeing a cited answer is a couple of seconds. It runs entirely on free tiers.
What we learned
Pure semantic search isn't enough on its own — you need exact-word search too. The order matters: filter before you rank, not after. Citations aren't a nice-to-have. If the user can't double-check you, the tool is worse. The clever retrieval was the quick part. The UI, the error messages, and the upload flow took just as long.
What's next for Research-RAG
Support more than PDFs (arXiv links, notes, Word docs). Follow-up questions that remember the last one. Better paper metadata via DOI / arXiv / Semantic Scholar lookups. Inline highlighting in the source PDF when you click a citation. Shared libraries so collaborators can ask questions across the same corpus
Built With
- css
- express.js
- geminiapi
- html
- javascript
- mongodb
- react
- vite
Log in or sign up for Devpost to join the conversation.