Since I was young, I've had a passion for all things related to rocks, fossils, and earth science. I even wrote my college essay about it. Unfortunately, classifying stones, particularly minerals, is traditionally a very hands-on process. This means that correctly classifying minerals and rock types typically requires extensive training and access to testing materials, which inspired me to implement machine learning to make classification faster and easier for the layman.
My final project is a web app that classifies images of stones into one of four classes: rock, fossil, mineral, or gemstone.
Building the Classifier
I built my classifier using TensorFlow and Keras to train a sequential convolutional neural network (CNN). Then, I used Flask to build a web app that can be run on a locally hosted server from the command line.
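For readers who have not seen one, a sequential CNN in Keras might look roughly like the sketch below. This is only an illustration and not the model I actually trained: the input size, layer counts, filter sizes, and the `build_model` helper are all assumptions made for the example. Only the four output classes come from the project itself.

```python
from tensorflow import keras
from tensorflow.keras import layers

# The four classes from the project; everything else here is an assumed value.
CLASS_NAMES = ["rock", "fossil", "mineral", "gemstone"]
IMAGE_SIZE = (128, 128)  # assumed input resolution, not the real one


def build_model(image_size=IMAGE_SIZE, num_classes=len(CLASS_NAMES)):
    """Build a small sequential CNN (hypothetical architecture)."""
    model = keras.Sequential([
        keras.Input(shape=(*image_size, 3)),      # RGB images
        layers.Rescaling(1.0 / 255),              # scale pixels to [0, 1]
        layers.Conv2D(32, 3, activation="relu"),  # learn low-level features
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),  # learn higher-level features
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),  # one probability per class
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer class labels
        metrics=["accuracy"],
    )
    return model


model = build_model()
```

Calling `model.summary()` prints the layer stack, and `model.fit(...)` would train it on labelled image batches.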
I faced enormous challenges from the beginning. I committed myself to building a classifier for rocks even though no suitable dataset existed, so I built my own by scraping the Smithsonian's NMNH Geology Collections Data Portal. I ran into a horde of problems related to the volume of data and to labelling, processing, and partitioning it. Creating my own dataset was incredibly time-consuming and led to a lot of frustrating setbacks. I also ran into problems building the CNN itself when I realized that my data was still not properly labelled or separated. I often found myself taking one step forward and two steps back.
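To give a sense of the partitioning step, here is a minimal sketch of splitting labelled images into training, validation, and test sets while keeping each class represented in every split. It is not the code I used: the `stratified_split` helper, the 70/15/15 fractions, and the file names are assumptions for illustration.

```python
import random
from collections import defaultdict


def stratified_split(samples, train_frac=0.7, val_frac=0.15, seed=0):
    """Split (path, label) pairs into train/val/test, class by class.

    The fractions are assumed values; whatever is left after the train
    and validation shares goes to the test split.
    """
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    splits = {"train": [], "val": [], "test": []}
    for label in sorted(by_label):
        paths = sorted(by_label[label])
        rng.shuffle(paths)
        n_train = round(len(paths) * train_frac)
        # Never take more validation items than remain after the train share.
        n_val = min(round(len(paths) * val_frac), len(paths) - n_train)
        splits["train"] += [(p, label) for p in paths[:n_train]]
        splits["val"] += [(p, label) for p in paths[n_train:n_train + n_val]]
        splits["test"] += [(p, label) for p in paths[n_train + n_val:]]
    return splits


# Usage with made-up file names: 20 images for each of the four classes.
labels = ["rock", "fossil", "mineral", "gemstone"]
samples = [(f"{label}_{i}.jpg", label) for label in labels for i in range(20)]
splits = stratified_split(samples)
```

Splitting per class rather than over the whole pool keeps a rare class from ending up entirely in one partition.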
I ran into problems building the Flask app as well, because I run Python 3.8 but TensorFlow required Python 3.5-3.7.
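A small guard like the one below can catch this kind of mismatch before the app tries to import TensorFlow. It is a sketch only: the `python_is_supported` helper is hypothetical, and the 3.5-3.7 range is simply the one I hit at the time, so it should be adjusted to whatever the installed TensorFlow release documents.

```python
import sys

# Supported range from my setup at the time; an assumption for any other install.
SUPPORTED_MIN = (3, 5)
SUPPORTED_MAX = (3, 7)


def python_is_supported(version=None, minimum=SUPPORTED_MIN, maximum=SUPPORTED_MAX):
    """Return True if the (major, minor) version falls inside the range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return minimum <= major_minor <= maximum


if not python_is_supported():
    print(
        "Warning: this Python version is outside the assumed supported range "
        f"{SUPPORTED_MIN} to {SUPPORTED_MAX}; TensorFlow may fail to install."
    )
```

Running the app inside a virtual environment pinned to a supported interpreter, as I ended up doing with Pipenv, avoids the problem altogether.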
Creating a working, sizable Kaggle dataset from my own bare binaries was a thrilling accomplishment. I am especially proud of how successful my CNN was, considering I was working with a less formal dataset. My final model reached a test accuracy of 93.8%.
What I learned
This course and this project taught me more about time management, expectations, and organization than I could have anticipated going in. I also became more familiar with libraries such as Beautiful Soup, Flask, and Requests. Pipenv was a particularly fun hurdle to tackle in order to get TensorFlow up and running on my local machine for the Flask app. Of course, I learned a lot about building a CNN and the importance of choosing the right kind of machine learning model. I also feel more proficient with Git and the terminal.
What's next for Stone Classifier
Hopefully in the future, I can delve further into the world of web scraping and build a bigger, better dataset. I'd like to have more specific classes and get closer to my original aspirations for this project.