Inspiration
BILL subscribers can make purchases with our corporate card solution. After a purchase, users can fill in details such as the purchase date, the amount, and the vendor name. To validate the purchase, a receipt must be attached, so reconciling all transactions is a time-consuming process. To reduce the work of matching uploaded receipts with past transactions, we challenge you to design a Receipt Matching algorithm that saves customers' time.
What it does
It finds the transaction that matches an uploaded receipt, using the receipt image and the text extracted from it.
How we built it
We built a transfer learning model to extract the text from receipt images. We then used a pipeline and BERT-based models to calculate the similarity between the extracted text and the transaction data.
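The similarity step can be illustrated with a minimal sketch. It is not our actual model: `toy_embed` is a hypothetical bag-of-words stand-in for a BERT sentence embedding so that the example runs with the standard library only, and the transaction fields (`vendor`, `date`, `amount`) and the helper names `describe` and `best_match` are assumptions made for illustration. The idea shown is to embed the OCR text and each transaction description, then pick the transaction with the highest cosine similarity.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Hypothetical stand-in for a BERT embedding: a bag-of-words count vector."""
    tokens = re.findall(r"[a-z0-9.]+", text.lower())
    return {token: float(count) for token, count in Counter(tokens).items()}


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def describe(transaction: Dict[str, str]) -> str:
    """Flatten a transaction record (assumed fields) into one comparable string."""
    return f"{transaction['vendor']} {transaction['date']} {transaction['amount']}"


def best_match(
    receipt_text: str,
    transactions: List[Dict[str, str]],
    embed: Callable[[str], Vector] = toy_embed,
) -> Tuple[int, float]:
    """Return the index and score of the transaction most similar to the receipt."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    receipt_vec = embed(receipt_text)
    scores = [cosine(receipt_vec, embed(describe(tx))) for tx in transactions]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Made-up example data: OCR output of one receipt and two candidate transactions.
transactions = [
    {"vendor": "Acme Coffee", "date": "2023-01-05", "amount": "4.50"},
    {"vendor": "City Hardware", "date": "2023-01-07", "amount": "23.99"},
]
ocr_text = "CITY HARDWARE 2023-01-07 TOTAL 23.99 THANK YOU"
index, score = best_match(ocr_text, transactions)
print(transactions[index]["vendor"], round(score, 3))
```

In a real setup, the `embed` argument would be replaced by a function that returns embeddings from a BERT-based model, with `cosine` adapted to dense vectors.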
Installation Environment
Development Environment: Linux cinnamon 20.2
System Installation:
sudo apt update
sudo apt install tesseract-ocr
sudo apt install libtesseract-dev
sudo apt-get install git-lfs
Pull the pretrained BERT model with Git LFS:
git lfs pull
The following Python libraries are also required and can be installed with pip:
pip3 install spacy
python3 -m spacy download en_core_web_sm
pip3 install torch==1.8.0 torchtext==0.9.0
pip3 install sentence_transformers
pip3 install pytorch_pretrained_bert
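After installation, a quick check can confirm that everything is in place. The script below is a small sketch and not part of the project: it assumes the packages are importable under the module names listed in `REQUIRED_MODULES` and that the system packages provide the `tesseract` and `git-lfs` executables. It only looks the names up and does not import or run anything.

```python
import importlib.util
import shutil
from typing import Dict

# Module names assumed to correspond to the pip packages installed above.
REQUIRED_MODULES = [
    "spacy",
    "en_core_web_sm",
    "torch",
    "torchtext",
    "sentence_transformers",
    "pytorch_pretrained_bert",
]

# Executables assumed to be provided by the apt packages installed above.
REQUIRED_BINARIES = ["tesseract", "git-lfs"]


def check_environment() -> Dict[str, bool]:
    """Map each required module or executable to True if it can be found."""
    status: Dict[str, bool] = {}
    for name in REQUIRED_MODULES:
        # find_spec locates a top-level module without importing it.
        status[name] = importlib.util.find_spec(name) is not None
    for name in REQUIRED_BINARIES:
        # shutil.which searches PATH for the executable.
        status[name] = shutil.which(name) is not None
    return status


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Any entry reported as MISSING points to the installation command that needs to be rerun.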
Challenges we ran into
The main challenges were extracting the text from receipt images and finding the similarity between that text and the transactions.
Accomplishments that we're proud of
The error decreased with each training epoch, and the model reached nearly 91% accuracy.
What we learned
How to implement, train, and test transfer learning models.
What's next for BillNet: Receipt Matching
The approach can be applied to a larger dataset and used with an optimized BERT model.
Built With
- python
- tensorflow