In an effort to inspire more Clemson students to engage with historic landmarks around campus, we set out to design a scavenger hunt that leads students through a series of clues.

What it does

When the user arrives at a landmark, they open the friendly UI designed at . A link directs them to take a picture of the monument they believe the clue points to. The picture is then sent through the frontend server to an AWS EC2 instance hosting our pre-trained image classification model, which classifies the image and sends a response back to the user with the next clue.
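Concretely, the classify-then-respond step can be thought of as a lookup from the model's predicted label to the next clue. This is a minimal sketch with made-up landmark labels and clue text, not our actual hunt data:

```python
# Ordered hunt: each entry pairs the label the classifier should return with
# the clue pointing to the NEXT stop. Labels and clues are placeholders.
HUNT = [
    ("landmark_1", "Clue 2: head to the second stop."),
    ("landmark_2", "Clue 3: head to the third stop."),
    ("landmark_3", "Congratulations, you finished the hunt!"),
]

def next_clue(predicted_label: str, current_stop: int) -> dict:
    """Compare the model's label with the expected landmark and build the
    response that goes back to the user."""
    expected, clue = HUNT[current_stop]
    if predicted_label == expected:
        return {"correct": True, "clue": clue, "next_stop": current_stop + 1}
    return {"correct": False, "clue": "Not quite -- try another photo!",
            "next_stop": current_stop}
```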

How we built it

The model was trained on Clemson's Palmetto Cluster. Because our dataset was small, we took a transfer learning approach: we started from the network in Google's TensorFlow for Poets codelab, which had already been trained to classify over 1,000 different classes. We then retrained it on our own images, reusing the pre-trained feature-extraction layers and fitting only the final classification layer, which let the network classify our landmarks with higher accuracy while reducing overfitting.
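The idea can be illustrated with a small numpy sketch: a frozen "feature extractor" (standing in for the pre-trained convolutional layers) feeds a new softmax layer, and only that final layer is trained on the small dataset. Everything here, the random projection and the toy labels alike, is illustrative rather than our actual network:

```python
import numpy as np

rng = np.random.default_rng(0)
W_frozen = rng.normal(size=(8, 4))      # stand-in for pre-trained weights (never updated)

def extract_features(x):
    """Frozen feature extractor, playing the role of the pre-trained layers."""
    return np.tanh(x @ W_frozen)

# Tiny toy dataset: 20 "images" (8-dim vectors), 2 "landmark" classes.
X = rng.normal(size=(20, 8))
y = (X[:, 0] > 0).astype(int)

F = extract_features(X)                 # features computed once; these layers stay frozen
W_head = np.zeros((4, 2))               # the only weights we actually train

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

losses = []
for _ in range(500):                    # gradient descent on the head alone
    p = softmax(F @ W_head)
    losses.append(-np.log(p[np.arange(len(y)), y]).mean())
    W_head -= 0.5 * F.T @ (p - np.eye(2)[y]) / len(X)
```

Because only the small head is fit, far fewer parameters have to be estimated from the few training images, which is what keeps overfitting in check.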

This trained classification model was then deployed on an AWS EC2 instance and exposed to accept JavaScript Ajax requests. A series of servers was connected, including a React frontend server for a clean UI, to pass the image file to the network and return the response to the user. These servers all ran in Docker containers and were deployed alongside one another using Portainer for container orchestration.
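A minimal sketch of exposing the model over HTTP for Ajax calls, using only Python's standard library; the real service and its routes may differ, and `classify_image` is a stand-in for the actual model call:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def classify_image(image_bytes: bytes) -> str:
    # Placeholder: the real server would run the retrained network here.
    return "landmark_1"

class ClassifyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        image = self.rfile.read(length)                       # raw bytes from the Ajax upload
        body = json.dumps({"label": classify_image(image)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Access-Control-Allow-Origin", "*")  # let the React frontend call us
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep per-request logging quiet

# To serve for real:
# HTTPServer(("0.0.0.0", 8000), ClassifyHandler).serve_forever()
```

The frontend's Ajax call would POST the photo bytes to this endpoint and read the predicted label out of the JSON response.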

Challenges we ran into

Docker is not always intuitive when it comes to connecting servers to one another, so transferring data from the user and back again became a serious challenge. The model was also difficult to train with so few training images; transfer learning was certainly better than training an entirely new network, but the model still tended to overfit. Lastly, iOS stores photos as .HEIC files, which most Linux distributions cannot read out of the box, so batch conversion was a difficult task.
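The batch-conversion step might be sketched as below. The actual HEIC decoding depends on an external tool or library (for example libheif's `heif-convert` or the pillow-heif package), so it is passed in as `convert_fn` rather than hard-coded:

```python
from pathlib import Path

def batch_convert(folder: str, convert_fn) -> list:
    """Convert every .HEIC file under `folder` to a .jpg alongside it.
    `convert_fn(src, dst)` does the actual decode/encode."""
    converted = []
    for heic in Path(folder).glob("*.HEIC"):
        jpg = heic.with_suffix(".jpg")
        convert_fn(heic, jpg)          # decode the HEIC, write a JPEG
        converted.append(jpg.name)
    return sorted(converted)
```

Injecting the converter keeps the loop itself testable on machines without HEIC support installed.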

Accomplishments that we're proud of

Despite these challenges, we trained our model to correctly classify images of the six landmarks included in the scavenger hunt. It was a fascinating exercise in data: by leveraging a pre-trained model, we effectively trained a classifier on a small dataset! We also wrote a Python script that loops the user through the scavenger hunt, progressing from one landmark to the next. Finally, the networking needed to connect the servers to the EC2 instance for data transfer was extremely complex, and we managed to navigate it while arriving at a functional product.
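The progression script mentioned above might look roughly like this; `classify`, `get_photo`, and `show` are stand-ins for the model call, the photo upload, and the UI message:

```python
def run_hunt(landmarks, classify, get_photo, show):
    """Walk the user through `landmarks` in order, advancing only when the
    classifier agrees the submitted photo matches the current landmark."""
    for target in landmarks:
        while classify(get_photo()) != target:
            show("That doesn't look like the right spot -- try again!")
        show(f"Found {target}! Here is your next clue.")
    show("Hunt complete!")
```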

What we learned

This experience has taught us all a great deal about machine learning, dataset management, data transfer, networking, and Docker containerization.

What's next for CUFindIt

We hope to increase our dataset size to generate a better network for image classification.
