Inspiration
My teammate Chris and I have worked with Photoshop and the Adobe suite for 7 years, and photography has been an ongoing passion of mine. This project gave us an opportunity to learn more about image manipulation and how a computer sees digital image formats.
What it does
Our project uses OpenCV, NumPy, and pytesseract to read in an image of a cartoon and output its text as a string, in the order in which it appears in the cartoon.
How we built it
We used OpenCV to transform the image, mostly through a custom sharpening kernel and threshold adjustment. We also converted the image to grayscale to remove interference from the BGR color channels. Then, we used a Hugging Face sentence-transformers model to compare the output text and score its accuracy. The accuracy was good: our first run on cartoon1 scored 94%.
Challenges we ran into
The hardest part was getting characters with curved edges, like W's and '...', to be read correctly. Getting the right combination of sharpening and threshold parameters was also difficult: when they were off, words became incoherent, because the characters still had to resemble the English alphabet to be recognized. Discovering how image manipulation works through kernels was a challenge too, but we settled on a specific sharpening kernel to use instead of an unsharp mask.
Accomplishments that we're proud of
We are proud that our project reached over 90% accuracy without training our own model. Discovering the different image manipulations and customizing a kernel was also cool.
What we learned
We learned from the OpenCV docs how image manipulation and image reading functions work in Python. We also learned how to compare arrays using tensors and how pytesseract converts images to strings.
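As a rough sketch of the comparison step, two texts can be embedded with a sentence-transformers model and scored by cosine similarity. We assume here that the score is the cosine similarity between the OCR output and a reference transcript. The model name `all-MiniLM-L6-v2` and the helper names are placeholders, and the similarity is computed with NumPy for brevity rather than with torch tensors.

```python
import numpy as np


def cosine_similarity(a, b):
    """Return the cosine of the angle between two 1-D vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        # A zero vector has no direction; treat it as "no similarity".
        return 0.0
    return float(np.dot(a, b) / denom)


def text_similarity(ocr_text, reference_text, model_name="all-MiniLM-L6-v2"):
    """Embed both texts and compare them (model name is an assumption)."""
    # Imported lazily so the helper above works without the library.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    ocr_vec, ref_vec = model.encode([ocr_text, reference_text])
    return cosine_similarity(ocr_vec, ref_vec)
```

A score near 1 means the two texts are close in meaning. Note that this measures semantic similarity, so it can stay high even when individual characters are misread.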
What's next for Calvin & Hobbes
Next, we would probably improve our project by training a deep learning model on the sample images. Since this is our first hack, we used an existing library, but we expect that training our own model would give better results.
Built With
- numpy
- opencv
- pytesseract
- python
- sentence-recognition
- torch
- visual-studio