Inspiration

DocQuery AI was inspired by the need for a tool that makes it easier to find information in PDFs. We wanted to create something that could quickly analyze multiple PDF documents and give clear, human-friendly answers. Our goal was to use AI to simplify how people interact with long, complex PDF documents.

What it does

DocQuery AI is a chatbot that analyzes multiple PDF documents at once. Users upload one or more PDF files, and the app extracts the text, processes it, and embeds it. Users can then ask questions about the content of those documents and get precise answers in real time.
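
To make the "processes" step more concrete, here is a minimal sketch of one common way to prepare extracted text for embedding: splitting it into overlapping chunks. This is an assumption for illustration, not a description of the project's actual code. The function name split_into_chunks and the chunk_size and overlap values are hypothetical, and the text is assumed to have already been extracted from the PDFs.

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap.

    Hypothetical helper: the overlap keeps a sentence that straddles a
    chunk boundary visible in both neighbouring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so the tail is not repeated as an extra, shorter chunk.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "Text that was extracted from an uploaded PDF would go here. " * 5
    for number, chunk in enumerate(split_into_chunks(sample, chunk_size=80, overlap=20)):
        print(number, repr(chunk))
```

Each chunk would then be embedded separately, so that a question can be matched against a specific passage rather than a whole document.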

How we built it

We built DocQuery AI in Python, with Streamlit for the web interface. A key component is Google Generative AI, which we use to generate text embeddings for the content of the uploaded PDF documents. We use FAISS as the vector store for indexing and retrieval, which keeps document search fast.
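
The embed, index, and retrieve flow described above can be sketched with the standard library alone. This is a simplified stand-in under stated assumptions, not the project's implementation: toy_embed is a hypothetical hashed bag-of-words function used in place of Google Generative AI embeddings, and TinyVectorIndex is a hypothetical brute-force cosine-similarity search used in place of FAISS.

```python
import hashlib
import math
import re


def toy_embed(text: str, dim: int = 256) -> list[float]:
    """Hypothetical stand-in for a real embedding model.

    Hashes each word into one of `dim` buckets, counts the hits, and
    L2-normalises the result so a dot product equals cosine similarity.
    """
    vector = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vector[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(value * value for value in vector))
    return [value / norm for value in vector] if norm else vector


class TinyVectorIndex:
    """Hypothetical stand-in for a vector store: stores vectors, scans them all."""

    def __init__(self) -> None:
        self._vectors: list[list[float]] = []
        self._chunks: list[str] = []

    def add(self, chunks: list[str]) -> None:
        # Embed every chunk once, at indexing time.
        for chunk in chunks:
            self._vectors.append(toy_embed(chunk))
            self._chunks.append(chunk)

    def search(self, query: str, k: int = 2) -> list[tuple[float, str]]:
        # Embed the question, score it against every stored chunk,
        # and return the k highest-scoring (score, chunk) pairs.
        query_vector = toy_embed(query)
        scored = [
            (sum(a * b for a, b in zip(query_vector, vector)), chunk)
            for vector, chunk in zip(self._vectors, self._chunks)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


if __name__ == "__main__":
    index = TinyVectorIndex()
    index.add([
        "FAISS indexes vectors for similarity search",
        "Streamlit renders the web interface",
        "Embeddings turn text into numeric vectors",
    ])
    for score, chunk in index.search("Which library handles similarity search?"):
        print(f"{score:.3f}  {chunk}")
```

In the real application, the retrieved chunks would be passed to the language model as context for answering the user's question. A dedicated vector store matters at scale because a full scan like the one above slows down linearly as more chunks are added.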

Challenges we ran into

One of the biggest challenges was selecting the most accurate AI model for the task and then optimizing it. We also found it difficult to keep answer generation both accurate and fast. In addition, integrating the different AI components raised technical hurdles that required careful debugging and testing.

Accomplishments that we're proud of

We are proud to have developed a robust AI-driven tool that successfully processes and analyzes multiple PDF documents. Achieving high accuracy in answer generation and making the tool user-friendly were significant milestones.

What we learned

Through this project, we deepened our understanding of AI applications in document analysis and natural language processing. We learned how to optimize AI models for specific tasks and integrate multiple technologies seamlessly.

What's next for DocQuery AI

Moving forward, we aim to enhance DocQuery AI with more advanced capabilities, such as better handling of document formats beyond PDF. We also plan to integrate more sophisticated question-answering models and improve the scalability of our solution.

Built With

  • faiss
  • googlegenerativeaiembeddings
  • langchain
  • langchain-ai-libraries:-sentence-transformers
  • python
  • streamlit