📖 About ScholarAI
🌟 Inspiration
As a postgraduate student, I often struggled with bulky theses, journal articles, and textbooks. Extracting key insights, comparing studies, and formatting references could take weeks. I wanted to create a tool that helps students and researchers work smarter by instantly generating summaries, answers, and citations from academic documents.
This inspired ScholarAI, an AI-powered academic assistant built to simplify research and writing.
📚 What We Learned
How to combine Elastic AI Search with Google Vertex AI (Gemini models) for hybrid document retrieval and summarization.
The importance of prompt engineering to ensure concise, academic-quality outputs.
That students value not just summaries, but also properly formatted citations and comparisons across studies.
Handling structured queries (like "What research gap is identified?") required tuning both the search stage and the LLM stage of the pipeline.
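To make the prompt-engineering point concrete, here is a minimal sketch of how retrieved chunks and a structured question could be combined into one prompt. The template wording, the `build_prompt` name, and the `max_words` parameter are illustrative assumptions, not the prompt ScholarAI actually uses.

```python
# Hypothetical template: the wording is an assumption, not ScholarAI's real prompt.
PROMPT_TEMPLATE = (
    "You are an academic research assistant.\n"
    "Answer the question using only the numbered excerpts below.\n"
    "If the excerpts do not contain the answer, say so explicitly.\n"
    "Keep the answer under {max_words} words and cite excerpts as [n].\n"
    "\n"
    "Excerpts:\n"
    "{excerpts}\n"
    "\n"
    "Question: {question}\n"
)


def build_prompt(question: str, chunks: list[str], max_words: int = 150) -> str:
    """Number each retrieved chunk and fill the template with it and the question."""
    excerpts = "\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )
    return PROMPT_TEMPLATE.format(
        max_words=max_words,
        excerpts=excerpts,
        question=question.strip(),
    )


if __name__ == "__main__":
    print(build_prompt(
        "What research gap is identified?",
        ["Prior work ignores rural schools.", "Longitudinal data is scarce."],
    ))
```

Numbering the excerpts gives the model something to point at, which makes it easier to check an answer against the source text.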
🛠️ How We Built It
Document Ingestion: Academic PDFs (theses, journals, reports) are uploaded.
Indexing with Elastic: We used Elasticsearch to store, chunk, and index text for fast semantic + keyword retrieval.
AI Summarization with Vertex AI: Relevant chunks are passed to Google Gemini models for summarization, Q&A, and citation extraction.
Output Formatting: Results are structured into:
Summaries (bullet points)
Comparisons (tables across documents)
Citations (APA/MLA auto-generated)
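The semantic + keyword retrieval step above can be sketched as a single search request body. This is a rough illustration, assuming an Elasticsearch 8.x index whose `embedding` field is mapped as `dense_vector`. The field names (`text`, `embedding`, `doc_id`, `page`) and the `build_hybrid_query` helper are hypothetical, and the query vector is assumed to come from whatever embedding model was used at indexing time.

```python
def build_hybrid_query(question: str, query_vector: list[float], k: int = 5) -> dict:
    """Build a search body that combines keyword matching with kNN vector search.

    All field names here are assumptions made for illustration.
    """
    return {
        "size": k,
        # Keyword side: full-text match on the chunk text.
        "query": {"match": {"text": {"query": question}}},
        # Semantic side: approximate nearest neighbours on the chunk embedding.
        "knn": {
            "field": "embedding",
            "query_vector": query_vector,
            "k": k,
            "num_candidates": max(50, 10 * k),
        },
        # Return only what the summarization step needs.
        "_source": ["doc_id", "page", "text"],
    }


if __name__ == "__main__":
    import json

    body = build_hybrid_query("What methodology was used?", [0.12, -0.03, 0.88])
    print(json.dumps(body, indent=2))
```

The resulting dictionary would be sent to the index's `_search` endpoint, and the returned chunks are what get passed on to the Gemini model.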
Workflow Equation:
$$\text{ScholarAI} = \text{Elastic}(\text{Search} + \text{Retrieval}) + \text{VertexAI}(\text{Summarization} + \text{Q\&A} + \text{Citation})$$
🚧 Challenges We Faced
PDF Parsing: Extracting clean text from different thesis formats (with tables, references, or images) was tricky.
Context Window Limits: Long documents exceeded LLM token limits, so we had to chunk and merge intelligently.
Citation Accuracy: Getting the AI to output citations in correct APA/MLA formats required repeated prompt tuning.
Time Constraint: Building a functional prototype within hackathon time pushed us to focus on core features first.
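For the context-window challenge, one common way to "chunk and merge" is a map-reduce pass: split the text into overlapping windows, summarize each window, then summarize the combined partial summaries. The sketch below assumes that approach and is not necessarily the exact strategy we shipped. Word counts stand in for tokens, and `summarize` is a placeholder for whatever function calls the LLM.

```python
from typing import Callable


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of `size` words; consecutive windows share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def summarize_long(
    text: str,
    summarize: Callable[[str], str],  # placeholder for the real LLM call
    size: int = 200,
    overlap: int = 40,
) -> str:
    """Map step: summarize each chunk. Reduce step: summarize the joined partials."""
    chunks = chunk_words(text, size, overlap)
    if not chunks:
        return ""
    if len(chunks) == 1:
        return summarize(chunks[0])
    partials = [summarize(chunk) for chunk in chunks]
    return summarize("\n".join(partials))


if __name__ == "__main__":
    # Stand-in summarizer that keeps the first five words of its input.
    fake_llm = lambda s: " ".join(s.split()[:5])
    print(summarize_long("word " * 500, fake_llm))
```

The overlap keeps sentences that straddle a boundary visible to both neighbouring chunks. For very long documents the joined partial summaries may still be too large, in which case the reduce step could be applied recursively.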
✅ Outcome
ScholarAI now allows users to:
Upload academic documents.
Ask natural-language questions like “Summarize Chapter 2” or “What methodology was used?”
Instantly get human-like summaries, Q&A, and properly formatted references.
This project demonstrates how AI can save students weeks of manual work, making research more efficient, accessible, and enjoyable.
Built With
- backplane-javascript
- docker
- elasticsearch
- elasticsearchapi
- flask
- gcp
- github
- markdown
- python
- q&a
- react