Inspiration
As developers, we spend a significant portion of our time navigating documentation to grasp concepts, troubleshoot errors, and write more efficient code. Combing through every page of documentation, however, is tedious. So we built DocuConvo AI, a tool that lets developers ask questions directly against the documentation and receive prompt, comprehensive answers.
What it does
DocuConvo operates in the following steps:
1. Crawling the documentation website: Our application crawls the entire documentation website provided by the organization.
2. Creating the knowledge base: The crawled content is processed and converted into vector embeddings, which are stored as an index in the Pinecone vector database.
3. Search: When a search request arrives from the organization's search bar, its embedding is compared against the knowledge base. The most similar vectors are passed to GPT-3.5 as context, along with the search query.
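The retrieval step above can be sketched as a minimal in-memory version. This is an illustrative stand-in, not DocuConvo's actual code: the toy vectors replace a real embedding model, the list replaces Pinecone, and the retrieved chunks would be prepended to the user's question in the GPT-3.5 prompt.

```python
import math

def cosine_similarity(a, b):
    # Dot product of the two vectors divided by the product of their norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k_context(query_vec, index, k=2):
    """Return the k documentation chunks whose embeddings are most
    similar to the query embedding (stand-in for a Pinecone query)."""
    scored = sorted(
        index,
        key=lambda item: cosine_similarity(query_vec, item["vector"]),
        reverse=True,
    )
    return [item["text"] for item in scored[:k]]

# Toy index: in a real system these vectors come from an embedding model.
index = [
    {"text": "How to install the SDK", "vector": [0.9, 0.1, 0.0]},
    {"text": "Authentication and API keys", "vector": [0.1, 0.9, 0.1]},
    {"text": "Rate limits and quotas", "vector": [0.0, 0.2, 0.9]},
]

query_vec = [0.85, 0.15, 0.05]  # pretend embedding of "how do I install?"
context = top_k_context(query_vec, index, k=2)
# `context` now holds the two chunks closest to the query.
```

In production the similarity search happens inside Pinecone rather than in application code; the shape of the flow (embed, rank by similarity, take top-k as context) is the same.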
How we built it
DocuConvo currently consists of three components:
1. A core API that efficiently crawls documentation websites and responds to search queries against them.
2. A website that facilitates project creation, enabling organizations with documentation websites to request crawls and generate an AI-powered knowledge base.
3. An API client/SDK designed for seamless interaction with the API when submitting search queries.
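As a sketch of what interacting with such an SDK might look like (the class, method, and parameter names here are hypothetical illustrations, not DocuConvo's actual interface; the transport callable stands in for the HTTP call to the core API):

```python
class DocuConvoClient:
    """Hypothetical SDK client wrapping the core API's search endpoint.
    The `transport` callable is injected so the HTTP layer can be
    swapped out or stubbed in tests."""

    def __init__(self, project_id, api_key, transport):
        self.project_id = project_id
        self.api_key = api_key
        self.transport = transport  # callable(payload: dict) -> dict

    def search(self, question):
        # Build the request the core API would receive from the SDK.
        payload = {
            "project": self.project_id,
            "key": self.api_key,
            "query": question,
        }
        response = self.transport(payload)
        return response["answer"]

# Stub transport standing in for a real HTTP POST to the API.
def fake_transport(payload):
    return {"answer": f"Answer for: {payload['query']}"}

client = DocuConvoClient("docs-site", "secret", fake_transport)
answer = client.search("How do I authenticate?")
```

Injecting the transport keeps the SDK thin: the same client works against the live API or a stub.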
Challenges we ran into
Our crawler struggled when the API received multiple simultaneous requests. To address this, we implemented a job-scheduling architecture: requests are queued and scheduled in the backend and handled asynchronously. We also ran into deployment issues with the crawler/Turborepo setup, which required extensive debugging and learning new skills such as dockerizing the application.
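A minimal version of that queued, asynchronous handling can be sketched with Python's standard library (this illustrates the pattern, not the actual scheduler DocuConvo uses; the crawl itself is replaced by a placeholder):

```python
import queue
import threading

crawl_jobs = queue.Queue()
results = {}

def worker():
    # Pull crawl requests off the queue one at a time, so bursts of
    # simultaneous API requests are absorbed by the queue instead of
    # overwhelming the crawler.
    while True:
        job = crawl_jobs.get()
        if job is None:  # sentinel value: shut the worker down
            crawl_jobs.task_done()
            break
        url = job["url"]
        results[url] = f"crawled {url}"  # stand-in for the real crawl
        crawl_jobs.task_done()

# Start a single background worker; adding workers adds parallelism.
threading.Thread(target=worker, daemon=True).start()

# The API enqueues jobs and returns immediately instead of blocking.
for url in ["https://docs.example.com/a", "https://docs.example.com/b"]:
    crawl_jobs.put({"url": url})

crawl_jobs.put(None)  # signal shutdown
crawl_jobs.join()     # wait until every queued job has been processed
```

The key property is that enqueueing is cheap and non-blocking, while the expensive crawl work drains from the queue at whatever rate the workers can sustain.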
Accomplishments that we're proud of
We are proud that DocuConvo handles crawling websites with over 500 pages of documentation and seamlessly stores the resulting knowledge base.
What we learned
We gained hands-on experience with dockerization and with building a scalable job scheduling/queuing system capable of handling increased load. This work also improved the quality of the AI's responses and gave us a clearer understanding of how to build a scalable, reliable crawler.
What's next for DocuConvo
Our upcoming milestone involves offering documentation owners user-friendly search components embedded with DocuConvo search. These components can be effortlessly copied and pasted into projects, ensuring seamless connectivity. Additionally, we plan to implement a real-time feature that displays to users the pages currently being crawled.