Inspiration
Finding insights in research papers can be slow and overwhelming. I built an AI research assistant to streamline the process by improving search accuracy, providing quick summaries, and enabling paper-specific question answering. While the core concept of ScholarLens was developed on 20 March, I significantly extended and refined it during the 24-hour hackathon, adding new features, optimizing the system, and integrating additional functionality specifically for this event.
What it does
ScholarLens is an AI-powered research assistant that:
✅ Finds relevant research papers using FAISS-based semantic search 🔍
✅ Summarizes key insights to save reading time 📄
✅ Answers complex research questions using retrieved context and LLMs 🤖
✅ Allows downloading of retrieved papers for further reading 📥
How I built it
I started with a basic keyword-based search but realized it wasn’t enough. To improve accuracy and efficiency:
- I used FAISS-based semantic search to retrieve relevant papers even when exact keywords weren’t present.
- I attempted RAG (Retrieval-Augmented Generation) but ran into embedding-dimension mismatches (384-d vs. 768-d vectors) that caused retrieval failures. Instead, I optimized the FAISS indexing for better retrieval.
- I integrated the Gemini API to handle general research questions beyond the retrieved papers.
Each challenge led to refinements, making the system faster, smarter, and more adaptable.
Challenges I ran into
- Embedding-dimension mismatches (384-d vs. 768-d) while implementing RAG, which I worked around by optimizing FAISS indexing.
- Balancing semantic search accuracy with system performance.
- Ensuring smooth integration of Gemini API for handling broader research queries.
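One cheap way to surface the embedding mismatch early is a dimension check before the query ever reaches the index; the function name and error message below are mine, not from ScholarLens:

```python
import numpy as np

def validate_query(index_dim: int, query: np.ndarray) -> np.ndarray:
    """Validate a query embedding's dimension before searching.

    Raises a clear error up front instead of letting the vector index fail
    deep inside search() when, e.g., a 768-d query meets a 384-d index.
    """
    if query.shape[-1] != index_dim:
        raise ValueError(
            f"embedding dimension mismatch: index is {index_dim}-d, "
            f"query is {query.shape[-1]}-d"
        )
    # Reshape to the (n_queries, dim) float32 layout FAISS expects.
    return query.astype(np.float32).reshape(1, -1)

# A 768-d query against a 384-d index is rejected immediately.
try:
    validate_query(384, np.zeros(768))
except ValueError as e:
    print(e)
```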
Accomplishments that I'm proud of
- Successfully combining semantic search with LLM-based Q&A for better research assistance.
- Overcoming technical hurdles like embedding mismatches and optimizing the FAISS retriever.
- Providing a seamless experience with paper retrieval, summarization, and Q&A in one system.
What I learned
- How to optimize FAISS indexing for better semantic retrieval.
- The limitations of RAG when working with different embedding dimensions.
- Effective integration of LLMs for complex question-answering using context.
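The context-grounded question answering described above boils down to assembling retrieved passages into a prompt before the LLM call. This is a minimal sketch; the prompt wording, the helper name, and the character budget are illustrative assumptions, not the exact prompt ScholarLens sends to Gemini.

```python
def build_prompt(question: str, passages: list[str], max_chars: int = 2000) -> str:
    """Join retrieved passages into a context block and append the question,
    truncating so the prompt stays within a rough length budget."""
    context = "\n\n".join(passages)[:max_chars]
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt(
    "What does the paper conclude?",
    ["Abstract: We study semantic retrieval.", "Conclusion: retrieval quality improves."],
)
print(prompt)
```

The resulting string is what gets passed to the LLM; keeping the instruction "using only the context below" is a common way to reduce answers that ignore the retrieved papers.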
What's next for ScholarLens
- Further optimizing the retrieval system for faster and more accurate results.
- Expanding the assistant to handle multimodal research (e.g., figures, tables).
- Enhancing the user experience with real-time updates and improved search accuracy.
Built With
- Languages & frameworks: Python, Streamlit
- Machine learning: Hugging Face Transformers (BART model), PyTorch, scikit-learn
- Vector search: FAISS
- APIs: Gemini API (for question answering)
- Version control: Git