Project can be found here. The CNN model is not included in the upload due to upload rate limit, but can be found in this repository.
Inspiration
Creativity always beats brute-force. We mimic how our brains look at the pictures, namely: if there's a label, let's read the label. If not: can we recognize the item?
What it does
To start of, we try to extract text from the images by using a library called Tesseract. After filtering the images with text, the remaining images are sent to the pre-trained CNN that uses the VGG-19 model as a base and by adding a few extra layers of our own, we can make the net a better fit for the specific images in the dataset.
How we built it
Like every project: start with some exploration and more importantly: be creative. We started with some drafts and discussions, went on with data analysis and decided to build 2 models and combine them in a clever workflow.
Challenges we ran into
During the analysis we noticed that the quality of photos could be improved. One way could be to let users send their photos through the app. When using the camera, users can see the label that is predicted by the CNN. If correct, the user can confirm this and thereby improving the quality of the data and adding more labeled data to the total dataset.
Accomplishments that we're proud of
Besides the fact that accuracy is quite well, we think we can easily improve it with some more ideas. We're proud that with some effort we can create stuff which is already quite valuable, and could be even more with ideas like the one above.
What we learned
Use cases like these are actually quite doable for a strong AI engine. Combining it in an infrastructure as described above, we think you can get up to 90% accuracy, taking in to account the feedback users give.
What's next for Tesseract meets CNN
- Image augmentation to enhance the training data
- Experimenting with changing layers in the network
- Contairize project and deploy on Kubernetes cluster
- Setup CI/CD environment to make frequent updates possible
- Model monitoring, in order to learn and auto-improve the models
Log in or sign up for Devpost to join the conversation.