Inspiration
Finding insights in research papers can be slow and overwhelming. I built an AI research assistant to streamline the process by improving search accuracy, providing quick summaries, and enabling paper-specific question answering. While the core concept of ScholarLens was developed on 20 March, I significantly extended and refined it during the 24-hour hackathon, adding new features, optimizing the system, and integrating additional functionality specifically for this event.
What it does
ScholarLens is an AI-powered research assistant that:
✅ Finds relevant research papers using FAISS-based semantic search 🔍
✅ Summarizes key insights to save reading time 📄
✅ Answers complex research questions using retrieved context and LLMs 🤖
✅ Allows downloading of retrieved papers for further reading 📥
How I built it
I started with a basic keyword-based search but realized it wasn’t enough. To improve accuracy and efficiency:
- I used FAISS-based semantic search to retrieve relevant papers even when exact keywords weren’t present.
- I attempted RAG (Retrieval-Augmented Generation) but ran into embedding-dimension mismatches (384-d vs. 768-d vectors) that caused retrieval failures. Instead, I optimized the FAISS indexing for better retrieval.
- I integrated the Gemini API to handle general research questions beyond the retrieved papers.
Each challenge led to refinements, making the system faster, smarter, and more adaptable.
Challenges I ran into
- Embedding-dimension mismatches (384-d vs. 768-d) while implementing RAG, which I worked around by optimizing FAISS indexing.
- Balancing semantic search accuracy with system performance.
- Ensuring smooth integration of Gemini API for handling broader research queries.
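One cheap way to surface the embedding mismatch early is a dimension check before the query ever reaches the index; the function name and error message below are mine, not from ScholarLens:

```python
import numpy as np

def validate_query(index_dim: int, query: np.ndarray) -> np.ndarray:
    """Validate a query embedding's dimension before searching.

    Raises a clear error up front instead of letting the vector index fail
    deep inside search() when, e.g., a 768-d query meets a 384-d index.
    """
    if query.shape[-1] != index_dim:
        raise ValueError(
            f"embedding dimension mismatch: index is {index_dim}-d, "
            f"query is {query.shape[-1]}-d"
        )
    # Reshape to the (n_queries, dim) float32 layout FAISS expects.
    return query.astype(np.float32).reshape(1, -1)

# A 768-d query against a 384-d index is rejected immediately.
try:
    validate_query(384, np.zeros(768))
except ValueError as e:
    print(e)
```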
Accomplishments that I'm proud of
- Successfully combining semantic search with LLM-based Q&A for better research assistance.
- Overcoming technical hurdles like embedding mismatches and optimizing the FAISS retriever.
- Providing a seamless experience with paper retrieval, summarization, and Q&A in one system.
What I learned
- How to optimize FAISS indexing for better semantic retrieval.
- The limitations of RAG when working with different embedding dimensions.
- Effective integration of LLMs for complex question-answering using context.
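The context-grounded question answering described above boils down to assembling retrieved passages into a prompt before the LLM call. This is a minimal sketch; the prompt wording, the helper name, and the character budget are illustrative assumptions, not the exact prompt ScholarLens sends to Gemini.

```python
def build_prompt(question: str, passages: list[str], max_chars: int = 2000) -> str:
    """Join retrieved passages into a context block and append the question,
    truncating so the prompt stays within a rough length budget."""
    context = "\n\n".join(passages)[:max_chars]
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt(
    "What does the paper conclude?",
    ["Abstract: We study semantic retrieval.", "Conclusion: retrieval quality improves."],
)
print(prompt)
```

The resulting string is what gets passed to the LLM; keeping the instruction "using only the context below" is a common way to reduce answers that ignore the retrieved papers.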
What's next for ScholarLens
- Further optimizing the retrieval system for faster and more accurate results.
- Expanding the assistant to handle multimodal research (e.g., figures, tables).
- Enhancing the user experience with real-time updates and improved search accuracy.
Built With
- Languages & frameworks: Python, Streamlit
- Machine learning: Hugging Face Transformers (BART model), PyTorch, scikit-learn
- Vector search: FAISS
- APIs: Gemini API (for question answering)
- Version control: Git