Inspiration
The inspiration for this project came from the need to streamline document management for non-profit organizations, specifically Heritage Square. Their existing workflow relied on manual document retrieval and organization, which was slow and inefficient. The goal was to develop an AI-powered assistant that could automate these processes, freeing up staff time for more strategic work like marketing, operations, and grants management.
What it does
- Automated File Retrieval: Using the Google Drive API, the system automatically retrieves all files from Heritage Square’s Google Drive and passes them into a vector database for processing. This significantly reduces the time spent by staff searching for relevant documents, improving overall productivity and efficiency.
- Intelligent Querying Capability: The system answers user questions in natural language. By integrating the LangChain framework with OpenAI models, it bases each response on the most relevant documents, improving the quality of the information retrieved.
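The idea behind retrieving "the most relevant documents" can be illustrated with a small sketch. This is not the project's code: the real system relies on a vector database and OpenAI embeddings, while the example below assumes tiny hand-written vectors and uses plain cosine similarity. The names `cosine_similarity`, `top_k`, and the sample file names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    indexed: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank (doc_id, vector) pairs by similarity to the query, best first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in indexed]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for real embeddings (assumption for illustration).
sample_index = [
    ("grants.pdf", [1.0, 0.0]),
    ("marketing.docx", [0.0, 1.0]),
    ("operations.txt", [1.0, 1.0]),
]
print(top_k([1.0, 0.1], sample_index, k=2))
```

In a production setup the vector database performs this ranking itself, usually with an approximate nearest-neighbor index rather than a full scan.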
How we built it
The project was built with a clear focus on automating document management for Heritage Square. The technical architecture was designed with the following components:
- Frontend: Developed using ReactJS, TailwindCSS, and TypeScript to create a user-friendly interface for querying documents.
- Backend: Built using Python for handling AI model integration and communication with the Google Drive API.
- Database: The Qdrant vector database was used to store vector embeddings of the documents, enabling efficient querying.
- APIs: OpenAI and Google Service APIs were used to handle document tagging, vectorization, and intelligent search functionality.
- Cloud Deployment: The entire system was deployed on AWS for scalability, so that Heritage Square could rely on this solution without performance bottlenecks.
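As a rough illustration of the backend's communication with Google Drive, the sketch below pages through a file listing. It assumes the Drive v3 `files.list` endpoint, whose responses carry a `files` array and, when more results remain, a `nextPageToken`. The names `iter_drive_files`, `fetch_page`, and the fake pages are hypothetical; the network call is injected so the example runs without credentials.

```python
from typing import Callable, Dict, Iterator, Optional

# A page fetcher takes an optional page token and returns one response dict
# shaped like a Drive v3 files.list response:
#   {"files": [...], "nextPageToken": "..."}
PageFetcher = Callable[[Optional[str]], Dict]


def iter_drive_files(fetch_page: PageFetcher) -> Iterator[Dict]:
    """Yield every file entry, following nextPageToken until it is absent."""
    token: Optional[str] = None
    while True:
        response = fetch_page(token)
        for file_entry in response.get("files", []):
            yield file_entry
        token = response.get("nextPageToken")
        if not token:
            break


# Fake two-page listing used instead of a real Drive account (assumption).
# With google-api-python-client, fetch_page could instead wrap something like:
#   service.files().list(pageToken=token,
#                        fields="nextPageToken, files(id, name)").execute()
_FAKE_PAGES = {
    None: {"files": [{"id": "1", "name": "budget.xlsx"}], "nextPageToken": "p2"},
    "p2": {"files": [{"id": "2", "name": "grant_report.pdf"}]},
}


def fake_fetch(token: Optional[str]) -> Dict:
    return _FAKE_PAGES[token]


for entry in iter_drive_files(fake_fetch):
    print(entry["name"])
```

Keeping the pagination loop separate from the API client makes it easy to test the ingestion step offline.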
Challenges we ran into
- Optimizing the AI querying process to retrieve the most relevant documents based on context was a major hurdle.
- Another challenge was maintaining a balance between the ease of use and the complexity of the underlying AI system, ensuring that non-technical users could benefit from the intelligent querying features without needing to understand the technical details.
Despite these challenges, the project made significant progress, and we are excited about future improvements, such as file categorization and enhanced sorting features, that will further optimize the system.
Accomplishments that we're proud of
- Automated Document Retrieval: Successfully implemented seamless file fetching from Google Drive using the Google Drive API, drastically reducing the manual effort involved in document management.
- Efficient Document Vectorization: Developed a system that breaks documents into chunks and converts them into vector embeddings, enabling faster and more precise querying.
- Intelligent Query System: Built an AI-powered querying mechanism that retrieves relevant documents based on content, improving accuracy and significantly enhancing productivity.
- Scalable Cloud Deployment: Deployed the solution on AWS, ensuring it can handle large volumes of documents and scale as needed, providing a reliable and robust platform for non-profits.
- Tag Generation for Documents: Successfully generated automatic tags for documents, laying the foundation for future classification and categorization improvements.
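The chunking step mentioned above can be sketched as follows. This is a minimal illustration, assuming fixed-size character chunks with a small overlap so that sentences cut at a boundary still appear whole in one chunk. The function name `chunk_text` and its default sizes are hypothetical, and the project's actual splitter may work differently (for example, by tokens or paragraphs).

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into chunks of up to chunk_size characters.

    Consecutive chunks share `overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
```

Each chunk would then be passed to an embedding model and stored in the vector database alongside a reference to its source file.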
What we learned
Throughout the project, we gained hands-on experience with integrating AI into document management systems. We learned how to efficiently retrieve documents from Google Drive using the Google Drive API, convert those documents into vector embeddings, and implement intelligent querying using OpenAI models. The project also deepened our understanding of deploying AI solutions in a cloud environment like AWS and of optimizing the workflow with LangChain for query-based retrieval. We also gained insight into the practical challenges of working with large datasets and vector databases, and of handling scalability for non-profits.
What's next for Heritage Square
- File Categorization: Implement a robust classification system that organizes documents based on high-level categories like marketing, operations, and grants, making file management even more intuitive.
- Advanced Sorting Features: Introduce reorganization and sorting functionalities, allowing users to filter documents by parameters such as creation date, relevance, or type, further streamlining access to important files.
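One possible shape for the planned filtering and sorting is sketched below. Since this feature is not built yet, everything here is an assumption: the `DocumentRecord` fields, the `filter_and_sort` function, and the sample documents are hypothetical and only show how category, type, and creation date could be combined.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class DocumentRecord:
    """Hypothetical metadata kept for each indexed document."""
    name: str
    category: str   # e.g. "marketing", "operations", "grants"
    file_type: str  # e.g. "pdf", "docx"
    created: date


def filter_and_sort(
    docs: List[DocumentRecord],
    category: Optional[str] = None,
    file_type: Optional[str] = None,
    newest_first: bool = True,
) -> List[DocumentRecord]:
    """Keep documents matching the given filters, ordered by creation date."""
    selected = [
        d for d in docs
        if (category is None or d.category == category)
        and (file_type is None or d.file_type == file_type)
    ]
    return sorted(selected, key=lambda d: d.created, reverse=newest_first)


sample_docs = [
    DocumentRecord("spring_campaign.docx", "marketing", "docx", date(2024, 3, 1)),
    DocumentRecord("grant_2023.pdf", "grants", "pdf", date(2023, 6, 15)),
    DocumentRecord("grant_2024.pdf", "grants", "pdf", date(2024, 5, 20)),
]
for doc in filter_and_sort(sample_docs, category="grants"):
    print(doc.name, doc.created)
```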
Built With
- amazon-web-services
- cognito
- langchain
- openai
- python
- qdrant
- react
- tailwind
- typescript