Inspiration

MLH GHW hosted a Day 4 challenge on AI/ML, which motivated me to build a text recognition model in Python that works on pyTesseract. I found a video on YouTube and followed along. It was quite fun to watch and code along.

What it does

It extracts the text contained in an image and outputs it as plain text. It also prints the position of each piece of text and the confidence level with which the model predicts it.
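
As a rough illustration of that output, here is a minimal sketch using pytesseract's image_to_data call, which returns the text, bounding box, and confidence for each detected word. The helper names (words_with_boxes, read_image) and the min_conf parameter are made up for this example and are not taken from the project code.

```python
def words_with_boxes(data, min_conf=0.0):
    """Turn a pytesseract image_to_data dict into a list of word records.

    `data` is assumed to be the dict returned by
    pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT).
    """
    words = []
    for i, text in enumerate(data["text"]):
        # Tesseract reports a confidence of -1 for rows that are not words.
        conf = float(data["conf"][i])
        if text.strip() and conf >= min_conf:
            words.append({
                "text": text,
                # Position as (left, top, width, height) in image pixels.
                "box": (data["left"][i], data["top"][i],
                        data["width"][i], data["height"][i]),
                "conf": conf,
            })
    return words


def read_image(path):
    """Run Tesseract on an image file (hypothetical wrapper for this sketch)."""
    # Imported lazily so words_with_boxes can be used without Tesseract installed.
    import pytesseract
    from PIL import Image

    data = pytesseract.image_to_data(
        Image.open(path), output_type=pytesseract.Output.DICT
    )
    return words_with_boxes(data)
```

Calling read_image on an image path would return one record per recognized word, which can then be printed or filtered by confidence.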

How we built it

We built a neural model that recognizes text in an image. We first collected about 4 GB of images together with data describing each image, including the text present in it. The model was trained on 80% of the data, and the remaining 20% was kept for testing. Only simple commands and prebuilt functions from pyTesseract were used.
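
The 80/20 split can be sketched roughly as below. This is only an assumption about how such a split might be done. The function name split_dataset, the shuffling step, and the fixed seed are illustrative and not from the project itself.

```python
import random


def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples and split them into train and test lists."""
    items = list(samples)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(items)
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]
```

For example, split_dataset(image_paths) would give two lists, with 80% of the items in the first and the rest in the second (image_paths being whatever list of samples is at hand).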

Challenges we ran into

The main problem was the training data, which was quite large. It contained not only the images but also the text, its location (in absolute image coordinates), and so on.

Accomplishments that we're proud of

The model consistently produces text with an 85-90% confidence level, which is quite an achievement in itself.
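
One way such a confidence figure could be computed is to average the per-word confidence values that Tesseract reports, skipping the -1 entries it uses for rows that are not words. This is a sketch under that assumption. The write-up does not say how the 85-90% figure was measured, and mean_confidence is a hypothetical helper.

```python
def mean_confidence(confs):
    """Average per-word confidences, ignoring Tesseract's -1 placeholders."""
    values = [float(c) for c in confs if float(c) >= 0]
    if not values:
        return None  # no recognized words, so no meaningful average
    return sum(values) / len(values)
```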

What we learned

We learned the main uses of PyTesseract and some other libraries such as easy_ocr and keras_ocr. We also learned how to build a model and train it efficiently. Beyond that, finding and cleaning the data contributed a huge part to the success of this project.

What's next for text-recognize

Maybe we can integrate it with a website or a machine, or just use it for vehicle number plate detection.
