Inspiration
The Heritage Hub System grew out of the challenges faced by Preservation Partners of the Fox Valley, who held a vast collection of historical files that were disorganized and untracked. Recognizing the importance of preserving these records and making them accessible for research, we conceived the Heritage Hub System to manage archives efficiently, bring order to chaos, and preserve historical heritage. By addressing this need, the system aims to empower organizations like Preservation Partners of the Fox Valley in their mission to safeguard and share their cultural heritage.
Kickstart - Heritage Hub
Problem Statement
Digital Records Management for Museums and Historical Sites
Introduction
The Heritage Hub System organizes a vast collection of historical files, including text documents, images, audio recordings, and more, and makes that collection searchable. It is designed to help Preservation Partners of the Fox Valley manage their archives efficiently, preserve historical records, and give researchers easy access.
Technologies Used
| Purpose | Technologies Used |
|---|---|
| Preprocessing Files | Java, Spring Boot |
| MetaData API, Machine Learning Models | Python |
| File Path Storage | Elasticsearch |
| Front-end | React.JS, HTML5, CSS3, JavaScript |
Architecture Diagrams
(Visit our GitHub page for More Info!)
Module 1 - Pre-Processing Architecture

Module 2 - File Retrieval Architecture

- Metadata extraction Architecture

Why Our Solution is Better Than Others
Cost-Effective: The system never copies or stores the files themselves. Only file paths are stored, so there are no additional storage costs and operational expenses stay low.
Faster Processing: Leveraging concurrent programming, our system can process files efficiently, reducing user wait times.
Local Storage: Everything is stored locally, eliminating the need for complex cloud infrastructure setup. The system can run offline, without an internet connection.
Lightweight: Our application is lightweight, ensuring it runs smoothly even on modest hardware.
No Cloud Exposure: Files are never uploaded to cloud storage, which helps protect the security and privacy of historical records.
Legacy File Support: Our system supports file formats from the 1980s, including .RTF & .doc files.
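As an illustration of the first two points, here is a minimal Python sketch of concurrent preprocessing that records only file paths and basic attributes, never file contents. It is not the project's actual code (the preprocessing module is written in Java with Spring Boot); the names `describe_file` and `build_index` and the record fields are hypothetical, chosen only for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def describe_file(path: Path) -> dict:
    """Build a small index record. The file itself is never copied."""
    stat = path.stat()
    return {
        "path": str(path.resolve()),       # only the location is stored
        "name": path.name,
        "extension": path.suffix.lower(),  # e.g. ".rtf", ".doc"
        "size_bytes": stat.st_size,
    }


def build_index(root: str, max_workers: int = 4) -> list[dict]:
    """Describe every file under root concurrently, sorted by path."""
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    # A thread pool lets slow disk reads overlap instead of running serially.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        records = list(pool.map(describe_file, files))
    return sorted(records, key=lambda r: r["path"])
```

In a real deployment the records would be sent to the search index rather than returned as a list, and the worker count would be tuned to the hardware.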
REST APIs Exposed
Our system exposes the following RESTful APIs:
- Search: Allows users to search for historical records using various search criteria.
- Update File Path: Enables the updating of file paths in the system.
- Get Metadata: Retrieves metadata associated with historical records.
- Delete: Deletes the file from its folder and removes the file's data from Elasticsearch.
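To make the semantics of these four operations concrete, the following Python sketch models them against an in-memory dictionary standing in for the search index. The class `ArchiveIndex` and its method names are hypothetical and do not reflect the real endpoint paths, request formats, or Elasticsearch calls, none of which are specified here.

```python
import os
from dataclasses import dataclass, field


@dataclass
class ArchiveIndex:
    """Hypothetical in-memory stand-in for the search index."""
    records: dict = field(default_factory=dict)  # file path -> metadata dict

    def add(self, path: str, metadata: dict) -> None:
        self.records[path] = dict(metadata)

    def search(self, term: str) -> list:
        """Return paths whose path or metadata values contain the term."""
        term = term.lower()
        return sorted(
            path for path, meta in self.records.items()
            if term in path.lower()
            or any(term in str(value).lower() for value in meta.values())
        )

    def update_file_path(self, old_path: str, new_path: str) -> None:
        """Re-key a record after its file has been moved on disk."""
        self.records[new_path] = self.records.pop(old_path)

    def get_metadata(self, path: str) -> dict:
        return dict(self.records[path])

    def delete(self, path: str) -> None:
        """Remove the index entry and the file itself, if it still exists."""
        self.records.pop(path)
        if os.path.exists(path):
            os.remove(path)
```

Note that delete is the only operation that touches the file on disk; the other three read or modify index data only.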
Steps to Setup and Run the Application
- Clone this repository to your local machine.
- Set up the environment as described in the Environment Setup Guide file.
- Configure the system by updating the necessary settings.
- Start the applications: tag_generator.py, KickstartApplication, and the front end (npm run start in the Frontend directory).
- Access the user interface or interact with the REST APIs.
Remaining Task Estimation
| Remaining Tasks | Estimated Time |
|---|---|
| Total for ready-to-ship package | 2 weeks |
| Containerization/Dockerization of all processes | 3 days |
| Testing on the entire dataset | 4 days |
| Buffer week for any bugs | 7 days |
Future Enhancements and Scope
We have plans to enhance the system with the following features:
- Improved user interface for a seamless user experience.
- Integration with existing systems like PastPerfect for seamless data migration.
- Enhanced file format support and more advanced metadata extraction techniques.
- User feedback mechanisms for continuous improvement.
Your feedback and contributions are welcome to help us expand the capabilities of this archive management system.
Contributors
2023_fall Hackathon
Team
Kickstart
Slack Channel
Built With
- elasticsearch
- java
- machine-learning
- natural-language-processing
- optical-character-processing
- python
- springboot