Inspiration
We were inspired by Bill.com's need for a less tedious way to match receipts to transactions. We felt our model would be the perfect fit to achieve their goals.
What it does
It uses OCR to draw bounding boxes around the text in a receipt image and extract the key features. We then use those features to match the OCR data against the table of manually entered receipt information, eliminating the need for a human to scroll and search for the correct transaction.
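The matching step can be sketched roughly as follows. This is a minimal illustration, not our exact code: the field names (`vendor`, `total`, `date`), the weights, and the `min_score` threshold are all assumptions chosen for the example, and the fuzzy name comparison uses Python's standard `difflib` to tolerate OCR misreads.

```python
from difflib import SequenceMatcher


def match_receipt(extracted, rows, min_score=0.6):
    """Return (row_index, score) for the best-matching row, or (None, score).

    `extracted` and each entry of `rows` are dicts with hypothetical keys
    "vendor" (str), "total" (float) and "date" (ISO string).
    """
    best_index, best_score = None, 0.0
    for i, row in enumerate(rows):
        # Fuzzy vendor comparison tolerates OCR errors such as "5" for "s".
        name_score = SequenceMatcher(
            None, extracted["vendor"].lower(), row["vendor"].lower()
        ).ratio()
        # Totals and dates are compared exactly (to the nearest cent).
        amount_score = 1.0 if abs(extracted["total"] - row["total"]) < 0.005 else 0.0
        date_score = 1.0 if extracted["date"] == row["date"] else 0.0
        # Illustrative weights, not tuned values.
        score = 0.4 * name_score + 0.4 * amount_score + 0.2 * date_score
        if score > best_score:
            best_index, best_score = i, score
    if best_score < min_score:
        return None, best_score
    return best_index, best_score


# Hand-written example rows standing in for the manually entered table.
table = [
    {"vendor": "Starbucks", "total": 5.75, "date": "2022-03-01"},
    {"vendor": "Walmart", "total": 42.10, "date": "2022-03-02"},
]
ocr_fields = {"vendor": "STARBUCK5", "total": 5.75, "date": "2022-03-01"}
index, score = match_receipt(ocr_fields, table)
```

Returning `None` below the threshold leaves ambiguous receipts for a human to review instead of forcing a wrong match.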
How we built it
We built it in Python using the pytesseract library. We also integrated a free online OCR API to extract the OCR data quickly.
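As a rough sketch of the bounding-box step: `pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)` returns a dict of parallel lists (`text`, `conf`, `left`, `top`, `width`, `height`, among others). The helper below, whose name and confidence cutoff are hypothetical, turns that dict into one box per confidently recognized word. The `sample` dict is hand-written for illustration and is not real OCR output.

```python
def confident_boxes(ocr, min_conf=60.0):
    """Keep one bounding box per non-empty word at or above `min_conf`.

    `ocr` is assumed to have the dict-of-lists layout produced by
    pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT).
    """
    boxes = []
    for i, text in enumerate(ocr["text"]):
        word = text.strip()
        # Confidence may arrive as a string or a number depending on version.
        conf = float(ocr["conf"][i])
        if not word or conf < min_conf:
            continue
        left, top = ocr["left"][i], ocr["top"][i]
        boxes.append({
            "text": word,
            "left": left,
            "top": top,
            "right": left + ocr["width"][i],
            "bottom": top + ocr["height"][i],
        })
    return boxes


# Hand-written stand-in for the OCR result of a small receipt crop.
sample = {
    "text": ["", "TOTAL", "$12.50", "~"],
    "conf": ["-1", "96", "91", "12"],
    "left": [0, 10, 80, 150],
    "top": [0, 200, 200, 205],
    "width": [300, 60, 50, 8],
    "height": [400, 18, 18, 10],
}
boxes = confident_boxes(sample)
```

Each resulting box can then be drawn on the image or passed to the feature-extraction step.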
Challenges we ran into
The main challenges were extracting structured data from the OCR output and building a function that takes that data and matches it to the correct transaction row in the data table.
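To give a sense of the extraction problem, here is a minimal sketch that pulls a vendor, total, and date out of raw OCR text with regular expressions. It rests on several assumptions that real receipts often break: the vendor is on the first non-empty line, the total follows the word "total" on the same line, and dates are written as MM/DD/YYYY. The function name and sample text are made up for the example.

```python
import re
from datetime import datetime

# "\btotal\b" skips "Subtotal"; the amount must be on the same line.
TOTAL_RE = re.compile(r"\btotal\b[^\d\n]*(\d+[.,]\d{2})", re.IGNORECASE)
DATE_RE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")


def parse_receipt(text):
    """Extract a dict with "vendor", "total" and "date" from OCR text.

    Any field that cannot be found is returned as None.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    vendor = lines[0] if lines else None

    total = None
    for m in TOTAL_RE.finditer(text):
        # Keep the last match; a comma decimal separator is normalized.
        total = float(m.group(1).replace(",", "."))

    date = None
    m = DATE_RE.search(text)
    if m:
        try:
            date = datetime.strptime(m.group(0), "%m/%d/%Y").date().isoformat()
        except ValueError:
            date = None  # e.g. an OCR misread such as 13/45/2022

    return {"vendor": vendor, "total": total, "date": date}


# Made-up OCR text for illustration.
receipt_text = (
    "STARBUCKS #1234\n"
    "03/01/2022 10:15\n"
    "Latte 5.25\n"
    "Subtotal 5.25\n"
    "Tax 0.50\n"
    "TOTAL $5.75\n"
)
fields = parse_receipt(receipt_text)
```

Returning `None` for missing fields keeps the parser from guessing, so the matching function can decide how to handle incomplete receipts.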
Accomplishments that we're proud of
We are proud that we successfully extracted text from the receipt images and drew bounding boxes around the key features.
What we learned
We learned how to use and implement OCR models, as well as how to work with different APIs.
What's next for Bill.com Challenge
We will continue to deepen our knowledge of the subject and build more OCR-related models.
Built With
- anaconda
- google-colab
- jupyter-notebook
- numpy
- online-ocr
- pandas
- pytesseract
- python
- seaborn