TEAM: Arjun -- Backend Development; Kelly -- Frontend Development
Presentation + Demo (Demo at 1:14) : https://youtu.be/0E39KpYYs78
Inspiration
Every year, millions of research papers are published, yet most people, no matter how curious, can't access the knowledge inside them. Google Scholar is overwhelming. arXiv and PubMed are unfiltered firehoses. Even finding the right papers is a dense, jargon-heavy process. There is no clear, guided way to begin learning from scientific literature.
We are two high school students who met at a research program and faced the same issue: where to begin.
That's why we built setos.ai.
What it does
We use natural language processing (NLP) and machine learning (ML) to convert any plain-language research question (e.g., “How does mutation count affect tumor behavior?”) into a step-by-step checklist (“roadmap”) of papers to read.
From there, we use LLMs to create study aids such as:
- Practice questions
- Summaries
- Vocabulary guides
Setos.ai is your one-stop shop for becoming an expert in any research topic.
How we built it
- Website: Python FastAPI backend + React frontend
- Paper database: Supabase with PostgreSQL
- Roadmap creation:
  - Uses Sci-BERT embeddings and LLM query expansion to map your question into the same semantic space as millions of research papers
  - Finds truly relevant matches without complex keyword searching
  - Organizes a personalized learning roadmap using the Kneedle algorithm, citation counts, and publication dates
- Study aids: Powered by the Gemini API (free tier)
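To make the roadmap step more concrete, here is a minimal Python sketch of the general idea: rank papers by cosine similarity to the question, cut the ranked list at a "knee", then order what remains. It is an illustration under stated assumptions, not the setos.ai code. The tiny 3-dimensional vectors stand in for real Sci-BERT embeddings, `knee_index` is a simplified chord-distance heuristic in the spirit of Kneedle rather than the Kneedle algorithm itself, and the final ordering (older papers first, ties broken by citation count) is an assumed rule, since the write-up does not say how citation counts and dates are combined. The names `cosine_similarity`, `knee_index`, and `build_roadmap` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knee_index(sorted_scores):
    """Index of the score farthest from the straight line (chord) that joins
    the first and last scores. Simplified stand-in for Kneedle."""
    n = len(sorted_scores)
    if n < 3:
        return n - 1  # too few points to have a knee: keep everything
    first, last = sorted_scores[0], sorted_scores[-1]
    best_i, best_gap = 0, -1.0
    for i, score in enumerate(sorted_scores):
        chord = first + (last - first) * i / (n - 1)
        gap = abs(score - chord)
        if gap > best_gap:
            best_i, best_gap = i, gap
    return best_i


def build_roadmap(question_vec, papers):
    """Keep the papers above the similarity knee, then order them for reading."""
    scored = sorted(
        ((cosine_similarity(question_vec, p["embedding"]), p) for p in papers),
        key=lambda pair: pair[0],
        reverse=True,
    )
    if not scored:
        return []
    cutoff = knee_index([score for score, _ in scored])
    kept = [paper for _, paper in scored[: cutoff + 1]]
    # Assumed ordering: earlier publications first, then most cited.
    return sorted(kept, key=lambda p: (p["year"], -p["citations"]))


# Toy data: made-up embeddings, years, and citation counts.
question = [1.0, 0.0, 0.0]
papers = [
    {"title": "A", "embedding": [1.0, 0.0, 0.0], "year": 2021, "citations": 50},
    {"title": "B", "embedding": [0.9, 0.1, 0.0], "year": 2015, "citations": 900},
    {"title": "C", "embedding": [0.8, 0.2, 0.0], "year": 2019, "citations": 120},
    {"title": "D", "embedding": [0.1, 0.9, 0.0], "year": 2020, "citations": 10},
    {"title": "E", "embedding": [0.0, 1.0, 0.0], "year": 2018, "citations": 300},
    {"title": "F", "embedding": [0.0, 0.9, 0.1], "year": 2017, "citations": 40},
]

roadmap = build_roadmap(question, papers)
print([p["title"] for p in roadmap])
```

In this toy example, papers D, E, and F fall below the knee and are dropped, so the roadmap contains only the three papers that point in roughly the same direction as the question.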
Challenges we ran into
We faced significant difficulties with:
- Sourcing high-quality paper data — API rate limits and cloud free-tier restrictions were very limiting
- Roadmap creation — initially tricky, but we settled on cosine similarity + Kneedle for simplicity
In the future, we hope to secure funding to overcome these limitations.
Accomplishments we're proud of
- Built a fully functional prototype that converts research questions into personalized reading roadmaps
- Integrated LLMs for study aids directly from papers
- Developed an efficient semantic matching pipeline using Sci-BERT embeddings
What we learned
- The importance of data quality and pipeline scalability for research-focused apps
- Cloud storage and optimization challenges
- Real-world application of NLP algorithms
What's next for setos.ai
- Increase paper coverage with PubMed, arXiv, and bioRxiv dumps
- Implement Gaussian Mixture Models (GMM) for more accurate roadmap creation
- Use citation networks to improve suggestion ordering
- Fine-tune LLMs to improve explanations, definitions, and practice questions
- Build a full recommendation system for paper suggestions beyond roadmaps
- Secure funding to transform this prototype into a fully featured tool
Built With
- fastapi
- gemini
- openalex
- postgresql
- python
- semantic-scholar
- supabase