Inspiration

Finding insights in research papers can be slow and overwhelming. I built an AI research assistant to streamline the process by enhancing search accuracy, providing quick summaries, and enabling paper-specific question answering. While the core concept of ScholarLens was developed on 20 March, I significantly extended and refined it during the 24-hour hackathon by adding new features, optimizing the system, and integrating additional functionality specifically for this event.

What it does

ScholarLens is an AI-powered research assistant that:
✅ Finds relevant research papers using FAISS-based semantic search 🔍
✅ Summarizes key insights to save reading time 📄
✅ Answers complex research questions using retrieved context and LLMs 🤖
✅ Allows downloading of retrieved papers for further reading 📥

How I built it

I started with a basic keyword-based search but realized it wasn’t enough. To improve accuracy and efficiency:

  • I used FAISS-based semantic search to retrieve relevant papers even when exact keywords weren’t present.
  • I attempted RAG (Retrieval-Augmented Generation) but faced embedding mismatches (384d vs. 768d), causing failures. Instead, I optimized FAISS indexing for better retrieval.
  • Integrated the Gemini API to handle general research questions beyond the retrieved papers.
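The embedding-mismatch failure mentioned above is cheap to guard against: a 384-d query vector simply cannot be searched against a 768-d index. A minimal sketch of such a guard, in plain NumPy (the function name and dimensions here are illustrative, not the project's actual code):

```python
import numpy as np

def check_query_dim(query_vec: np.ndarray, index_dim: int) -> np.ndarray:
    """Ensure a query embedding matches the index dimension before searching.

    Mixing embedding models (e.g. a 384-d MiniLM query against a 768-d
    BERT-base index) fails at search time; catching it up front gives a
    clearer error than a failure deep inside the retriever.
    """
    vec = np.asarray(query_vec, dtype="float32").reshape(1, -1)
    if vec.shape[1] != index_dim:
        raise ValueError(
            f"Embedding dimension mismatch: query is {vec.shape[1]}-d but "
            f"the index expects {index_dim}-d. Use the same embedding model "
            f"for indexing and querying."
        )
    return vec

# A 384-d query against a 768-d index is rejected before any search runs:
try:
    check_query_dim(np.zeros(384), index_dim=768)
except ValueError as e:
    print(e)
```

The broader lesson is the same one the challenge taught: the model that built the index must be the model that embeds the queries.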

Each challenge led to refinements, making the system faster, smarter, and more adaptable.

Challenges I ran into

  • Embedding mismatch issues while implementing RAG (384d vs. 768d). I overcame this by optimizing FAISS indexing.
  • Balancing semantic search accuracy with system performance.
  • Ensuring smooth integration of Gemini API for handling broader research queries.

Accomplishments that I'm proud of

  • Successfully combining semantic search with LLM-based Q&A for better research assistance.
  • Overcoming technical hurdles like embedding mismatches and optimizing the FAISS retriever.
  • Providing a seamless experience with paper retrieval, summarization, and Q&A in one system.

What I learned

  • How to optimize FAISS indexing for better semantic retrieval.
  • The limitations of RAG when working with different embedding dimensions.
  • Effective integration of LLMs for complex question-answering using context.

What's next for ScholarLens

  • Further optimizing the retrieval system for faster and more accurate results.
  • Expanding the assistant to handle multimodal research (e.g., figures, tables).
  • Enhancing the user experience with real-time updates and improved search accuracy.

Built With

  • Languages & frameworks: Python, Streamlit
  • Libraries: PyTorch, scikit-learn
  • Machine learning: Hugging Face Transformers (BART model)
  • Vector search: FAISS
  • APIs: Gemini API (for question answering)
  • Version control: Git