Inspiration

Preservation Partners of the Fox Valley runs five museums and historical sites. Its computer files and documents date back to the 1980s and badly need to be reorganized in a sustainable, searchable way. The organization struggles to do research because its digital records are difficult to find.

Impact

Our solution is designed to address all of the challenges Preservation Partners identified. We built a search engine that uses recent advances in NLP and LLMs so that users can search not just text but also images, files, and audio. It minimizes manual labor by automatically captioning images and transcribing audio so that both are easily searchable.

Documentation

We provide documentation for both users and developers.

Complexity

We understand the importance of solving complex issues, and we're using large language models (LLMs) to make things easier. Our solution doesn't just skim the surface: it goes into the details of organizing and finding digital records effectively. We're working on a system to help Preservation Partners:

  1. Find specific documents, images, and more quickly.
  2. Organize records in a way that makes sense to them.

For example:

  • Easily search for documents mentioning a person, event, or place.
  • Find images and documents on specific topics.
  • Locate newspaper articles about local history.
  • Sort historical photos by date, location, or subject without hassle, thanks to LLMs.
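As a rough illustration of the sorting and topic lookups listed above, here is a minimal sketch in plain Python. It assumes each record already carries metadata such as a date, a location, and subject tags. The Record class and its field names are hypothetical, and the sketch does not show how an LLM would produce that metadata.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Record:
    # Hypothetical record shape; the field names are illustrative only.
    title: str
    kind: str  # e.g. "photo", "document", "article"
    dated: date
    location: str
    subjects: List[str] = field(default_factory=list)


def find_by_subject(records: List[Record], subject: str) -> List[Record]:
    """Return records tagged with the given subject (case-insensitive)."""
    wanted = subject.lower()
    return [r for r in records if wanted in (s.lower() for s in r.subjects)]


def sort_by_date(records: List[Record]) -> List[Record]:
    """Return a new list ordered from oldest to newest."""
    return sorted(records, key=lambda r: r.dated)
```

Sorting by location or filtering by kind would follow the same pattern with a different key or predicate.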

Security

We've taken thorough measures to close potential security vulnerabilities and prevent unauthorized data access. Our system is designed to keep Preservation Partners' records safe and confidential.

Moreover, we've established a role-based security model. By granting each IAM role only the minimal permissions it needs, we keep tight control over access to sensitive data. This adds an extra layer of security on top of the existing Auth0 framework.
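The idea of minimal, role-based permissions can be sketched as a deny-by-default lookup. This is only an illustration of the principle, not our actual IAM configuration: the role names, action names, and the is_allowed helper are all hypothetical.

```python
# Hypothetical role-to-permission map; the role and action names are
# illustrative and are not taken from the project's real configuration.
ROLE_PERMISSIONS = {
    "viewer": frozenset({"search", "read"}),
    "archivist": frozenset({"search", "read", "upload"}),
    "admin": frozenset({"search", "read", "upload", "delete"}),
}


def is_allowed(role: str, action: str) -> bool:
    """Deny by default: unknown roles and unlisted actions are rejected."""
    return action in ROLE_PERMISSIONS.get(role, frozenset())
```

Because anything not explicitly granted is refused, adding a new role or action never widens access by accident.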

How we built it

Our solution for Preservation Partners' digital record organization and search engine was developed using state-of-the-art technologies and methodologies. Here's an overview of how we built it:

  1. Data Ingestion: We securely ingested Preservation Partners' historical records into our system.

  2. NLP and LLM Integration: We used large language models (LLMs) to build a search engine capable of searching text, images, files, and audio.

  3. Automation: Our system auto-captions images and transcribes audio, enhancing searchability and user experience.

  4. Frontend and Backend Development: We created a user-friendly web app with a responsive frontend and an efficient serverless backend.

  5. Documentation: We provide comprehensive documentation for users and developers.
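The ingestion, automation, and search steps above can be sketched roughly as follows. This is a simplified stand-in under stated assumptions. It uses a plain keyword index rather than an LLM-based search, and the captioning and transcription models are passed in as placeholder functions. The SearchIndex class and every name in it are hypothetical.

```python
import re
from collections import defaultdict
from typing import Callable, Dict, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class SearchIndex:
    def __init__(
        self,
        caption_image: Callable[[str], str],
        transcribe_audio: Callable[[str], str],
    ) -> None:
        # Both callables are placeholders for whatever models do the real work.
        self._to_text = {"image": caption_image, "audio": transcribe_audio}
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, item_id: str, kind: str, content: str) -> None:
        """Index an item; images and audio are first converted to text."""
        convert = self._to_text.get(kind)
        text = convert(content) if convert else content
        for word in tokenize(text):
            self._postings[word].add(item_id)

    def search(self, query: str) -> Set[str]:
        """Return the ids of items that contain every word in the query."""
        words = tokenize(query)
        if not words:
            return set()
        hits = [self._postings.get(word, set()) for word in words]
        return set.intersection(*hits)
```

For text items, content is the text itself. For images and audio, it is whatever the placeholder function expects, such as a file path. Once captions and transcripts are indexed as text, a single query can return documents, images, and recordings together.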

Accomplishments

We're proud of several accomplishments:

  1. Comprehensive Solution: Our system addresses all of Preservation Partners' challenges in digital record management.

  2. Advanced Search: The search engine handles various content types, providing an unparalleled search experience.

  3. Automation: Automatic image captioning and audio transcription reduce manual effort.

  4. Security: Robust security and minimal IAM role permissions ensure data safety.

What we learned

Throughout the project, we gained valuable insights:

  1. Advanced Tech: We learned about NLP, LLMs, and their real-world applications.

  2. Data Management: Managing extensive data collections requires careful planning.

  3. Security: Implementing effective security measures is essential.

  4. Collaboration: Effective teamwork is crucial for project success.

What's next

  1. We plan on further fine-tuning our models to improve their search capabilities.