First things first: why is it called I_Hate_Nautilus?
At 12 AM, I discovered that my model's accuracy was oscillating (more on that later). I wasn't sure whether the cause was my dataset or the structure of the network, so I quickly adapted the network to differentiate cats from dogs instead of unhealthy food from healthy food. I downloaded a large dataset (30,000 training images!) and then tried to put the dog photos and cat photos in separate directories (each photo's filename was prefixed with the animal it showed). I made the grave mistake of trying to move the files through Nautilus, the default file explorer for Ubuntu. Unfortunately, Nautilus freezes when attempting to move thousands of files. I didn't realize it had frozen until six hours had passed (during which I was working on something else).
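A few lines of Python would have done the move reliably. Here's a sketch (the folder layout and `sort_by_prefix` helper are my own illustration, assuming filenames like `cat.123.jpg` / `dog.456.jpg` as in the Kaggle dataset):

```python
import shutil
from pathlib import Path

def sort_by_prefix(src, labels=("cat", "dog")):
    """Move every "<label>.<n>.jpg" in src into a src/<label>s/ subfolder."""
    src = Path(src)
    for label in labels:
        (src / (label + "s")).mkdir(exist_ok=True)
    for img in src.glob("*.jpg"):
        label = img.name.split(".")[0]  # e.g. "cat" from "cat.123.jpg"
        if label in labels:
            shutil.move(str(img), str(src / (label + "s") / img.name))

# Usage (assuming the archive was unpacked into ./train):
# sort_by_prefix("train")
```

Unlike a GUI file manager, this finishes in seconds on tens of thousands of files and tells you immediately if something goes wrong.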
Inspiration
One day, I was watching a VlogBrothers video. At one point, John said something along the lines of:
The key to dieting is very simple: Eat less brown stuff, eat more green stuff
Pictures of brown, unhealthy food (burgers, a variety of fried foods, etc.) and pictures of green, healthy food (lettuce, cucumber, kale, cabbage, etc.) filled the screen to prove his point.
What it's supposed to do
It is supposed to run a Convolutional Neural Network (CNN) on images of food. The network would be trained on a dataset containing brown, unhealthy foods labeled as unhealthy and green, healthy foods labeled as healthy.
A user would then send an image to a server through some sort of UI, the server would have the model predict whether the food was healthy, and the result would be sent back to the user.
What it actually does & Challenges I Experienced
There were 3 large issues with the original configuration:
1. Dataset does not contain correct images
Initially, I used the Flickr API to fetch images of "unhealthy food" and "healthy food". This was a bad idea, largely because the images on Flickr often focused on other subjects (primarily people handling the food).
2. Dataset too small
Only ~1,000 images from Flickr were used to train the CNN. That is too small a dataset to get good results.
3. Structure of the model causes accuracy to oscillate
Accuracy oscillation is when the accuracy swings up and down as training steps pass, instead of steadily improving over time. This is bad because it makes it difficult to reach a high accuracy.
How I built it
This was built with TensorFlow and Python 3.6, using a Kaggle dataset containing thousands of images of cats and dogs.
Did your model work with the cats and dogs?
No. It still suffered from oscillation.
Accomplishments that I'm proud of
I'm proud of having learned something from this experience.
What I learned
I learned a great deal of new things while working on this project:
- Don't trust Nautilus
- What oscillation is and why it's bad
- How a CNN works and how to implement one in TensorFlow
- What batch normalization is and how to implement it in TensorFlow
- How to use TensorBoard, a powerful visualization tool for TensorFlow
- What dropout is and how to implement it in TensorFlow
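Two of those concepts, batch normalization and dropout, can be sketched framework-free in a few lines of NumPy (a concept illustration of the math, not the project's actual TensorFlow code):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature (column) over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def dropout(x, rate, rng):
    """Zero activations with probability `rate`; scale survivors by 1/(1-rate)."""
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

x = np.array([[1.0, 200.0],
              [3.0, 400.0]])
y = batch_norm(x)  # each column now has roughly zero mean and unit variance
```

Batch normalization keeps each layer's inputs on a consistent scale between training steps, which is one of the standard remedies for unstable (oscillating) training; dropout fights overfitting by forcing the network not to rely on any single activation.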
What's next for I_Hate_Nautilus
Probably a complete rewrite and more time spent on collecting data to train the neural network on.
Built With
- python
- tensorflow