Inspiration
MLH GHW hosted a Day 4 challenge for AI/ML, which motivated me to build a text recognition model in Python using pyTesseract. I found a video on YouTube and followed along with it. It was quite fun to watch and code along.
What it does
It extracts the text contained in an image and outputs it as plain text. It also prints the position of each piece of text and the confidence with which the model predicts it.
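As a rough sketch of how text, position, and confidence can be read out together: pytesseract's `image_to_data` returns parallel lists (`text`, `conf`, `left`, `top`, `width`, `height`) when asked for `Output.DICT`. The helper names `extract_words` and `read_image` and the `min_conf` threshold are our own illustration, not necessarily what the project code looks like.

```python
def extract_words(data, min_conf=0.0):
    """Turn pytesseract's column-style dict into a list of word records.

    `data` is assumed to have the shape returned by
    pytesseract.image_to_data(..., output_type=Output.DICT).
    """
    words = []
    for i, text in enumerate(data["text"]):
        # conf may arrive as int, float, or string depending on the version
        conf = float(data["conf"][i])
        # Tesseract reports -1 for entries that are not recognized words
        if not text.strip() or conf < min_conf:
            continue
        words.append({
            "text": text,
            # (left, top, width, height) in pixels
            "box": (data["left"][i], data["top"][i],
                    data["width"][i], data["height"][i]),
            "conf": conf,
        })
    return words


def read_image(path):
    """Run OCR on an image file (hypothetical helper)."""
    # Imported lazily so extract_words can be used without Tesseract installed
    import pytesseract
    from PIL import Image

    data = pytesseract.image_to_data(
        Image.open(path), output_type=pytesseract.Output.DICT
    )
    return extract_words(data)
```

Calling `read_image("some_image.png")` (the file name is a placeholder) would give one record per recognized word, each with its text, bounding box, and confidence.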
How we built it
We built a neural model that recognizes text in images. We first collected 4 GB of images, along with data describing each image and the text present in it. The model is trained on 80% of the data, and the remaining 20% is kept for testing. Only simple commands and prebuilt functions from pyTesseract were used.
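The 80/20 split can be sketched in a few lines of plain Python. This is only an illustration: the function name `train_test_split`, the fixed `seed`, and the shuffle-then-cut approach are our assumptions here, and a library helper would do the same job.

```python
import random


def train_test_split(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples and cut them into a training and a test part."""
    shuffled = list(samples)               # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

With 10 samples and the default fraction, this gives 8 training samples and 2 test samples, with no sample appearing in both parts.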
Challenges we ran into
The main problem was the training data, which was quite large. It contained not only the images but also the text, its location (in absolute image coordinates), etc.
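To make the shape of one training record concrete, here is a hypothetical layout. The class name `TextAnnotation`, its field names, and the bounds check are assumptions for illustration; the actual dataset format may differ.

```python
from dataclasses import dataclass


@dataclass
class TextAnnotation:
    """One labelled text region (hypothetical layout)."""
    image_path: str
    text: str
    # Box position and size in absolute pixels, measured from the top-left corner
    left: int
    top: int
    width: int
    height: int

    def fits_inside(self, image_width, image_height):
        """Check that the box lies fully within the image bounds."""
        return (
            self.left >= 0 and self.top >= 0
            and self.width > 0 and self.height > 0
            and self.left + self.width <= image_width
            and self.top + self.height <= image_height
        )
```

A check like `fits_inside` is one simple way to catch broken labels while cleaning a large dataset.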
Accomplishments that we're proud of
The model consistently outputs text with an 85-90% confidence level, which is quite an achievement in its own right.
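One way such a figure could be computed is by averaging the per-word confidence values that Tesseract reports. This is a sketch under that assumption: the function name `mean_confidence` is ours, and it skips the -1 values Tesseract uses for entries that are not words.

```python
def mean_confidence(confidences):
    """Average the per-word confidences, ignoring Tesseract's -1 placeholders."""
    valid = [float(c) for c in confidences if float(c) >= 0]
    if not valid:
        return None  # nothing was recognized
    return sum(valid) / len(valid)
```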
What we learned
We learned the main uses of PyTesseract and some other libraries such as easy_ocr and keras_ocr. We also learned how to build a model and train it efficiently. More than that, finding and cleaning the data contributed a huge part to the success of this project.
What's next for text-recognize
Maybe we can integrate it with a website or a machine, or just use it for vehicle number plate detection.
