I have been working with machine learning models for some time. With the large datasets they require, one has to keep deleting data to free up space. Cortx allows one to train machine learning models on massive data stored on the Cortx-s3.
What it does
It creates an integration that fetches data from Cortx and feeds it directly to a neural network using PyTorch.
How we built it
We installed the Cortx OVA image on a VirtualBox VM, created a Cortx-s3 account, and connected the s3 data to a machine learning model running on the local machine.
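The connection step can be sketched with boto3 as below. This is a minimal sketch, not the project's actual code: the helper names `cortx_s3_kwargs` and `make_cortx_client` are hypothetical, the endpoint URL and keys are placeholders for the values of your own Cortx-s3 account, and `region_name` is only a dummy value passed so that boto3 does not need a configured region.

```python
def cortx_s3_kwargs(endpoint_url, access_key, secret_key):
    """Collect the keyword arguments for an S3 client that talks to Cortx."""
    return {
        # Point boto3 at the Cortx-s3 server instead of AWS.
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        # Assumption: a placeholder region, since no AWS region is involved.
        "region_name": "us-east-1",
    }


def make_cortx_client(endpoint_url, access_key, secret_key):
    """Create a boto3 S3 client for the given Cortx endpoint."""
    # Imported here so the helper above works without boto3 installed
    # (install it with `pip install boto3`).
    import boto3

    return boto3.client("s3", **cortx_s3_kwargs(endpoint_url, access_key, secret_key))
```

A call such as `make_cortx_client("http://<cortx-vm-address>", "<access key>", "<secret key>")` would then return a client whose `list_buckets()` and `get_object()` requests go to the VM rather than to AWS.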
Challenges we ran into
Copying and moving objects in the s3 is not implemented, so I had to change some steps of the original idea.
Accomplishments that we're proud of
Being able to create a custom Dataset Loader that loads data directly from the s3 and feeds it to a densenet201 model.
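A loader of this kind might look like the sketch below. It is an illustration under assumptions rather than the project's implementation: the class name `S3ImageDataset`, the `decode` hook, and `make_image_decoder` are hypothetical, and `client` is assumed to be a boto3 S3 client already pointed at Cortx. The class relies on the fact that a map-style PyTorch dataset only needs `__len__` and `__getitem__`.

```python
import io


class S3ImageDataset:
    """Map-style dataset that reads each sample from an S3 bucket on demand."""

    def __init__(self, client, bucket, samples, decode=None):
        self.client = client          # boto3 S3 client (assumed to target Cortx)
        self.bucket = bucket          # bucket name holding the images
        self.samples = list(samples)  # (object key, integer label) pairs
        self.decode = decode          # optional callable: raw bytes -> tensor

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        key, label = self.samples[index]
        # Download the object body into memory; nothing is written to disk.
        raw = self.client.get_object(Bucket=self.bucket, Key=key)["Body"].read()
        item = self.decode(raw) if self.decode is not None else raw
        return item, label


def make_image_decoder(size=224):
    """Build a bytes -> image tensor function (needs Pillow and torchvision)."""
    from PIL import Image
    from torchvision import transforms

    pipeline = transforms.Compose(
        [transforms.Resize((size, size)), transforms.ToTensor()]
    )

    def decode(raw):
        # Decode the downloaded bytes as an RGB image, then resize and convert.
        return pipeline(Image.open(io.BytesIO(raw)).convert("RGB"))

    return decode
```

Wrapped in `torch.utils.data.DataLoader(dataset, batch_size=...)`, such a dataset yields batches that can be passed to a model like `torchvision.models.densenet201`. The 224-pixel size is an assumed default, not a value taken from the project.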
What we learned
- How to use boto3.
- How to create a custom PyTorch Dataset Loader for data stored in the s3 environment.
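As an illustration of the boto3 side, the sketch below lists the objects in a bucket and turns them into (key, label) pairs that a custom dataset could consume. It rests on assumptions that are not stated in this write-up: that the server answers `list_objects_v2` requests, and that keys are laid out as `<prefix>/<class name>/<file>`, so the parent "folder" can serve as the label. The function name `list_labeled_keys` is hypothetical.

```python
def list_labeled_keys(client, bucket, prefix=""):
    """Return ([(key, label_index), ...], [class names]) for a bucket."""
    # A paginator follows continuation tokens, so large buckets are covered.
    paginator = client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page holds no objects.
        for obj in page.get("Contents", []):
            key = obj["Key"]
            # Skip folder markers and keys without a parent "folder".
            if "/" in key and not key.endswith("/"):
                keys.append(key)

    def class_of(key):
        # Assumed layout: the second-to-last path component is the class.
        return key.split("/")[-2]

    classes = sorted({class_of(key) for key in keys})
    index = {name: i for i, name in enumerate(classes)}
    samples = [(key, index[class_of(key)]) for key in keys]
    return samples, classes
```

Sorting the class names keeps the label indices stable between runs, which matters when a trained model is reloaded later.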
What's next for Integrate PyTorch and Cortx
For this setup, we trained on sample data. The next step is to try more datasets to make the model more efficient.