Payment artifacts such as invoices take considerable human effort to review and process for payment. If OCR and allied technologies can make that process more efficient, less error-prone, and far less labor-intensive, they could address a gap in the market, so that more services no longer have to spend enormous amounts on payment and document processing.
What it does
The vision (a schematic is attached) is to automate the end-to-end process outlined above: adding an invoice (an image file) triggers a process that extracts the relevant fields using Form Recognizer and persists them through both the Graph and Table APIs of CosmosDB. REST APIs could then expose the parsed invoice's fields, either for further review or for downstream payment processing.
Given a limited time budget (less than 25 hours), only the invoice parsing and persistence to CosmosDB are complete.
Both the Graph and Table APIs were used because they serve two different purposes: the Table API provides schema-less storage with certain SQL characteristics, while the Graph API provides ways to probe relationships between vendors and customers.
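The split between the two stores described above can be illustrated with a minimal sketch. It only shapes data and does not call any Azure SDK. The field names (`invoice_id`, `vendor`, `customer`), the `billed` edge label, and both helper functions are hypothetical and are not taken from the project's code; the only Azure-specific detail assumed is that a Table entity carries a `PartitionKey` and a `RowKey`.

```python
from typing import Dict, List


def to_table_entity(invoice: Dict[str, str]) -> Dict[str, str]:
    """Flatten a parsed invoice into a schema-less table entity."""
    entity = {
        # A Table entity is addressed by PartitionKey + RowKey.
        # Using vendor and invoice id for them is an assumption of this sketch.
        "PartitionKey": invoice["vendor"],
        "RowKey": invoice["invoice_id"],
    }
    # All other fields are copied as-is; no fixed schema is required.
    for name, value in invoice.items():
        if name not in ("vendor", "invoice_id"):
            entity[name] = value
    return entity


def to_graph_elements(invoice: Dict[str, str]) -> Dict[str, List[Dict[str, str]]]:
    """Describe the vendor -> customer relationship as vertices and one edge."""
    vendor = {"id": invoice["vendor"], "label": "vendor"}
    customer = {"id": invoice["customer"], "label": "customer"}
    edge = {
        "label": "billed",  # hypothetical edge label
        "from": vendor["id"],
        "to": customer["id"],
        "invoice_id": invoice["invoice_id"],
    }
    return {"vertices": [vendor, customer], "edges": [edge]}
```

The table entity answers "what does this invoice say?", while the graph elements answer "who bills whom?", which is the relationship the Graph API is meant to probe.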
How we built it
The initial ideation was very elaborate, but we quickly settled on something tractable within a limited time budget of less than 30 hours. It was built using Python and the Azure SDKs. Most of the orchestration is not automated and was done through the Azure portal.
Challenges we ran into
Mostly trying out the APIs and linking them together to work as a single process.
Accomplishments that we're proud of
An automated process from storage to OCR to persistence. Only the API to expose the parsed invoices is missing.
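The storage-to-OCR-to-persistence chain can be sketched as a single function with each stage passed in as a callable. This is only an illustration under assumptions: `process_invoice` and its parameters are hypothetical names, and in the real project the stages would be backed by Azure Storage, Form Recognizer, and CosmosDB rather than the stand-ins a caller supplies here.

```python
from typing import Callable, Dict, Iterable


def process_invoice(
    blob_name: str,
    read_blob: Callable[[str], bytes],
    extract_fields: Callable[[bytes], Dict[str, str]],
    persisters: Iterable[Callable[[Dict[str, str]], None]],
) -> Dict[str, str]:
    """Run one invoice through storage -> OCR -> persistence."""
    # Stage 1: fetch the invoice image from storage.
    image = read_blob(blob_name)
    # Stage 2: extract the invoice fields (the OCR step).
    fields = extract_fields(image)
    # Stage 3: hand the same fields to every persistence target,
    # for example one writer for the table store and one for the graph store.
    for persist in persisters:
        persist(fields)
    return fields
```

Keeping the stages as plain callables means the chain can be exercised locally with fakes before any cloud resource is wired in.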
What we learned
It's pretty simple to use the Azure SDKs compared to those of the other major cloud platforms. Automated orchestration facilities still lag behind those of other providers.
Some of the Azure APIs/SDKs seem to be in flux, and API/SDK signatures keep changing, which requires more effort to refactor.
What's next for Automated Invoice processor
A serverless set of functions to run this process on a trigger. Linkages to Anomaly Detector as well as Personalizer would be ideal.
Integration with Azure Communication Services for automated payment authorization, based on the confidence metrics of Form Recognizer, would also be ideal.
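A confidence-based authorization rule of the kind described above could look roughly like the following sketch. It assumes each extracted field comes with a confidence score between 0 and 1; the required field names, the 0.9 threshold, and the `route_invoice` function are hypothetical choices for illustration, not part of the project.

```python
from typing import Dict, Tuple

# Maps a field name to (extracted value, confidence between 0 and 1).
ParsedFields = Dict[str, Tuple[str, float]]

# Hypothetical set of fields that must be trusted before paying.
REQUIRED_FIELDS = ("invoice_id", "vendor", "total")


def route_invoice(fields: ParsedFields, threshold: float = 0.9) -> str:
    """Return 'auto_authorize' only if every required field is confident."""
    for name in REQUIRED_FIELDS:
        if name not in fields:
            # A missing field can never be auto-authorized.
            return "manual_review"
        _value, confidence = fields[name]
        if confidence < threshold:
            # One weak field is enough to send the invoice to a human.
            return "manual_review"
    return "auto_authorize"
```

Gating on the weakest required field, rather than on an average, keeps a single badly read total from being masked by several well-read fields.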