I am contributing to the development of a framework that takes document images and converts them into a format that is easier for people with dyslexia to understand. The pipeline includes:
● An OCR engine trained using LSTM and CRNN-family architectures.
● Image captioning to convert images in documents into meaningful text captions.
● An AI engine to convert or narrate chemical reactions, mathematical formulas, flow charts, graphs, table data, etc.
● Conversion of the text obtained from the OCR engine into speech, using a fully-convolutional attention-based neural text-to-speech (TTS) system.

The proposed system is still in development, so it has not been deployed to the global community, but a proof-of-concept (POC) demo can be found at [https://drive.google.com/file/d/1Uz-CiddIzoe5ysHz1116aHAlQbg3vBBW/view?usp=sharing]. [IIIT-Hyderabad Intranet: 10.2.16.111:8000/]
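To make the flow between the pipeline stages concrete, here is a minimal sketch of how they might be chained. It is not the project's actual code: the `Region` type, the stage functions (`ocr_text`, `caption_image`, `narrate_structured`, `synthesize_speech`) and their outputs are all hypothetical placeholders standing in for the trained OCR, captioning, narration and TTS models.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    """A hypothetical detected region of a document image."""
    kind: str     # e.g. "text", "image", "formula", "table", "flowchart"
    content: str  # placeholder payload; a real system would hold pixels


# The functions below are stand-ins (assumptions), not real models.
def ocr_text(region: Region) -> str:
    # Placeholder for the LSTM/CRNN OCR engine.
    return region.content


def caption_image(region: Region) -> str:
    # Placeholder for the image captioning model.
    return f"Image showing {region.content}."


def narrate_structured(region: Region) -> str:
    # Placeholder for the engine that narrates formulas, tables, charts, etc.
    return f"{region.kind.capitalize()}: {region.content}."


def document_to_narration(regions: List[Region]) -> str:
    """Route each region to the matching stage and join the results."""
    parts = []
    for region in regions:
        if region.kind == "text":
            parts.append(ocr_text(region))
        elif region.kind == "image":
            parts.append(caption_image(region))
        else:
            parts.append(narrate_structured(region))
    return " ".join(parts)


def synthesize_speech(text: str) -> bytes:
    # Placeholder for the neural TTS system; returns the UTF-8 bytes of the
    # text instead of audio so that the sketch stays runnable.
    return text.encode("utf-8")


regions = [
    Region("text", "Water forms when hydrogen burns."),
    Region("image", "a flask"),
    Region("formula", "2 H2 + O2 -> 2 H2O"),
]
narration = document_to_narration(regions)
audio = synthesize_speech(narration)
```

Routing by region type keeps each model independent, so any single stage could be retrained or swapped without touching the others.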

