Inspiration
We saw the brand new Samsung Family Hub smart fridge at CES 2017, which requires manually logging in the goods stored inside. That inspired us to create a smart fridge that automatically logs what's in the fridge, lets users access that data remotely, and recommends information to users based on what they have stored.
What it does
This is an IoT-based smart fridge that uses computer vision to automatically log food, informs users through text messages of what's stored inside and its expiration dates, and recommends healthier, smarter use of their current storage through features like checking nutrition and searching for recipes related to stored items.
How we built it
We used a button on an Arduino board to emulate the action of “closing the fridge door”. The signal from the button is sent to a PC through a serial COM port. When the PC receives that signal, the Kinect camera is triggered to capture a photo of the current contents of the fridge. The photo is then compressed and sent to our web server, which is written in Python + Flask and deployed on the Google App Engine Flexible Environment. This web server also contains the logic for responding to Twilio messages, described below.

When the web server receives the photo, it stores it in Google Cloud Storage and keeps some basic image metadata in the Google Cloud Datastore database. The Google Cloud Vision API is then called to analyze the photo and label what each item is and which category it belongs to. The labels returned by the Cloud Vision API are passed to the Google Knowledge Graph API to be further narrowed down to things people would normally put in a fridge, and those results are stored in the Google Cloud Datastore database. At this point the fridge identifies the items put into it by automatically capturing and analyzing photos.

Every time new items are added to the fridge, Twilio sends an SMS notification to inform the user. Users are also able to text Twilio some basic commands to:
- Check what is currently in the fridge
- Check which item is about to pass its expiration date
- Check the nutrition of the food stored
- Search for recipes related to some items
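A rough sketch of how the server could map incoming SMS commands to reply text. The command words, items, dates, and the in-memory store below are invented for illustration; the real server queries Google Cloud Datastore and sends the reply back through Twilio:

```python
# Hypothetical SMS command dispatcher behind the Twilio webhook.
# FRIDGE stands in for the Datastore records: item -> expiration date.
from datetime import date

FRIDGE = {
    "milk": date(2017, 1, 20),
    "eggs": date(2017, 2, 1),
}

def handle_sms(body, today=date(2017, 1, 18)):
    """Map an incoming SMS body to a reply string."""
    cmd = body.strip().lower()
    if cmd == "list":
        return "In your fridge: " + ", ".join(sorted(FRIDGE))
    if cmd == "expiring":
        # Items within 3 days of their expiration date
        soon = [item for item, exp in FRIDGE.items()
                if (exp - today).days <= 3]
        return "Expiring soon: " + (", ".join(sorted(soon)) or "nothing")
    return "Commands: list, expiring, nutrition <item>, recipes <item>"

print(handle_sms("list"))      # → In your fridge: eggs, milk
print(handle_sms("expiring"))  # → Expiring soon: milk
```

In the deployed server, the return value of a function like this would be wrapped in a TwiML `<Message>` response by the Flask route that Twilio posts to.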
Challenges we ran into
1) Capturing the Kinect photo with the least noise and incorporating the Arduino-based trigger for the photo
2) Integrating the local image capture, Python web server, Google Cloud Platform, and Twilio, and making them work flawlessly. Specifically, the challenges include the following:
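The trigger loop itself is simple once the serial plumbing works. A testable sketch, with the serial port replaced by any callable that returns the next line, so the logic runs without hardware (the real code would read the Arduino's COM port, e.g. with pyserial):

```python
# Hedged sketch of the door-close trigger loop. The "CLOSED" message
# format and the callback are assumptions for illustration.
def watch_for_door_close(read_line, on_close, max_reads=100):
    """Poll the serial line; fire on_close() once per 'CLOSED' message."""
    for _ in range(max_reads):
        line = read_line()
        if line is None:            # port closed / end of input
            break
        if line.strip() == b"CLOSED":
            on_close()              # e.g. capture a Kinect frame and upload

# Example with canned input standing in for the Arduino:
events = iter([b"OPEN\n", b"CLOSED\n", b"OPEN\n", b"CLOSED\n", None])
captures = []
watch_for_door_close(lambda: next(events), lambda: captures.append("snap"))
print(len(captures))  # → 2
```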
- Image format conversion
- Image compression and processing
- Handling HTTP POST/GET requests between the local machine and the web server for images, and between the web server and Twilio for sending and receiving texts
- Creating an appropriate database structure to store images and item labels
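As one example of the integration work, the local capture client has to package the compressed image into an HTTP POST body the Flask server can unpack. A minimal sketch using only the standard library; the endpoint's JSON field names here are invented for illustration:

```python
# Hypothetical client/server packaging of a compressed frame.
import base64
import json

def build_upload_request(jpeg_bytes, fridge_id="fridge-1"):
    """Client side: wrap compressed JPEG bytes in the JSON POST body."""
    return json.dumps({
        "fridge_id": fridge_id,
        "image_b64": base64.b64encode(jpeg_bytes).decode("ascii"),
    })

def parse_upload_request(body):
    """Server side: recover the raw image bytes from the POST body."""
    payload = json.loads(body)
    return payload["fridge_id"], base64.b64decode(payload["image_b64"])

body = build_upload_request(b"\xff\xd8fake-jpeg\xff\xd9")
fid, img = parse_upload_request(body)
print(fid)  # → fridge-1
```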
3) At first, it was really hard to pick the right label from the roughly 10 labels returned by the Cloud Vision API. We used Knowledge Graph first to narrow the list down to 3-5 labels, and then manually processed them according to how “general” or “specific” they are.
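A simplified sketch of that narrowing step. The type whitelist and the set of "too general" labels below are stand-ins for the actual Knowledge Graph results, and the ranking heuristic is illustrative rather than the exact production logic:

```python
# Hypothetical label-narrowing step after the Vision API call.
GENERAL = {"food", "produce", "fruit", "vegetable", "ingredient"}
FRIDGE_TYPES = {"apple", "milk", "carrot", "banana", "cheese", "egg"}

def pick_label(vision_labels):
    """From ~10 (description, score) Vision labels, prefer known fridge
    items; otherwise fall back to the most specific remaining label."""
    candidates = [(desc.lower(), score) for desc, score in vision_labels
                  if desc.lower() in FRIDGE_TYPES]
    if not candidates:
        # Drop overly general labels before falling back to everything
        specific = [(d.lower(), s) for d, s in vision_labels
                    if d.lower() not in GENERAL]
        candidates = specific or [(d.lower(), s) for d, s in vision_labels]
    return max(candidates, key=lambda pair: pair[1])[0]

print(pick_label([("Food", 0.97), ("Fruit", 0.95), ("Apple", 0.91)]))  # → apple
```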
4) There were some misleading parts in the Python documentation of the Cloud Vision API: the URI format stated in the docs is not the format the actual function requires. We finally figured it out by looking at the C# version of the documentation.
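For reference, the underlying REST request to the Vision API's `images:annotate` endpoint distinguishes between an image referenced by URI and inline image bytes; a photo in Cloud Storage is passed as a `gs://` URI under `image.source.imageUri` rather than the inline `content` field. A sketch of building that body (the bucket path is made up):

```python
# Building the JSON body for the Vision API images:annotate REST call
# with LABEL_DETECTION on a Cloud Storage image.
import json

def annotate_body(gs_uri, max_results=10):
    """Request body asking for up to max_results labels for one image."""
    return json.dumps({
        "requests": [{
            "image": {"source": {"imageUri": gs_uri}},
            "features": [{"type": "LABEL_DETECTION",
                          "maxResults": max_results}],
        }]
    })

body = json.loads(annotate_body("gs://smart-fridge/photos/item.jpg"))
print(body["requests"][0]["image"]["source"]["imageUri"])
```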
Accomplishments that we're proud of
We finished it early enough to write this :p
What we learned
We learned a lot along the way, both technical and non-technical.
What's next for Smart Fridge
Computer Vision System
- Better recognition of photos containing multiple items of different categories
- More accurate and systematic labeling of new items
Data log-in/Request methods
- Use speech recognition to log in data, complementary to Computer Vision
- A smarter twilio assistant capable of natural language processing
Data Utilization Features
- Automatically refill necessities through Google Express