Inspiration

BILL subscribers can make purchases with our corporate card solution. After a purchase, users can fill in details such as the purchase date, the amount, and the vendor name. To validate the purchase, a receipt must be attached, so reconciling all transactions is time-consuming. To ease the task of matching uploaded receipts with past transactions, we challenge you to design a Receipt Matching algorithm that saves customers time.

What it does

It matches uploaded receipts to past transactions using both the receipt images and their text.

How we built it

We built a transfer learning model to extract the text from receipt images. Then we used a pipeline and BERT-based models to calculate the similarity between receipts and transactions.
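The similarity step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the receipt text has already been extracted by OCR, and it ranks candidate transactions by cosine similarity of text embeddings. The names `toy_embed`, `cosine`, and `rank_transactions` are hypothetical, and `toy_embed` is a bag-of-words stand-in used only so the sketch runs without a model download; a BERT-based sentence encoder would take its place.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Sequence, Tuple


def toy_embed(texts: List[str]) -> List[List[float]]:
    """Stand-in encoder (an assumption, NOT the project's BERT model):
    bag-of-words counts over the vocabulary of the given texts."""
    tokenized = [re.findall(r"[a-z0-9.]+", t.lower()) for t in texts]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([float(counts[word]) for word in vocab])
    return vectors


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return float(dot / (norm_a * norm_b))


def rank_transactions(
    receipt_text: str,
    transactions: List[str],
    embed: Callable[[List[str]], Sequence[Sequence[float]]] = toy_embed,
) -> List[Tuple[float, str]]:
    """Return (score, transaction) pairs, most similar to the receipt first."""
    vectors = embed([receipt_text] + list(transactions))
    receipt_vec, txn_vecs = vectors[0], vectors[1:]
    scored = [(cosine(receipt_vec, vec), txn) for vec, txn in zip(txn_vecs, transactions)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Hypothetical OCR output and transaction descriptions, for illustration only.
    receipt = "STARBUCKS COFFEE 2021-03-14 TOTAL 12.50"
    candidates = [
        "Office Depot 2021-03-10 84.99",
        "Starbucks 2021-03-14 12.50",
        "Uber 2021-03-14 23.10",
    ]
    for score, txn in rank_transactions(receipt, candidates):
        print(f"{score:.3f}  {txn}")
```

To use a BERT-based encoder instead of the stand-in, one option is to pass the `encode` method of a `SentenceTransformer` instance from the `sentence_transformers` package as `embed`, since it accepts a list of strings and returns one vector per string. Which pretrained model to load is left open here.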

Installation Environment


Development Environment: Linux cinnamon 20.2

System Installation:

        sudo apt update
        sudo apt install tesseract-ocr
        sudo apt install libtesseract-dev
        sudo apt install git-lfs

Pull the pretrained BERT model with Git LFS:

        git lfs pull

The following Python packages are also required, installed with pip:

        pip3 install spacy

        python3 -m spacy download en_core_web_sm

        pip3 install torch==1.8.0 torchtext==0.9.0

        pip3 install sentence_transformers

        pip3 install pytorch_pretrained_bert
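After installation, a quick check like the one below can confirm that the system tools and Python packages listed above are visible. This is a small sketch using only the standard library. The function name `check_environment` is hypothetical, and the lists simply mirror the install commands in this section.

```python
import importlib.util
import shutil
from typing import Dict

# Command-line tools installed with apt in the steps above.
REQUIRED_BINARIES = ["tesseract", "git-lfs"]

# Importable module names for the pip packages listed above.
REQUIRED_MODULES = [
    "spacy",
    "en_core_web_sm",
    "torch",
    "torchtext",
    "sentence_transformers",
    "pytorch_pretrained_bert",
]


def check_environment() -> Dict[str, bool]:
    """Map each required tool or module name to whether it was found."""
    status: Dict[str, bool] = {}
    for name in REQUIRED_BINARIES:
        # shutil.which returns None when the executable is not on PATH.
        status[name] = shutil.which(name) is not None
    for name in REQUIRED_MODULES:
        # find_spec returns None when a top-level module cannot be located.
        status[name] = importlib.util.find_spec(name) is not None
    return status


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

The check only tests presence, not versions, so the pinned `torch==1.8.0` and `torchtext==0.9.0` versions still need to be verified separately.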

Challenges we ran into

The main challenges were extracting text from the receipt images and measuring the similarity between receipts and transactions.

Accomplishments that we're proud of

With each epoch, the error decreased, and we reached about 91% accuracy.

What we learned

How to implement, train, and test transfer learning models.

What's next for BillNet: Receipt Matching

The project can be extended to work with a larger dataset and an optimized BERT model.
