We worked together to create an intelligent RPA product, and thought about how automation could make the auditing process easier by reducing manual work and freeing up time for higher-value tasks.
What it does
iMatch by Team RIPA automates the matching of invoice images to records. We built it with further improvements in mind, so that it could grow into a startup idea.
How we built it
We used OCR technology to extract words from images, then did some minor data analysis on what the extracted information looks like. Next, we filtered out the extra spaces and newlines to get a list of strings containing the readable words or numbers from each image. We then experimented with different scoring and matching algorithms, such as Levenshtein distance, the Hungarian algorithm, and sequence-matching libraries. We finally decided on a greedy approach that takes the highest score and matches it first. Because the Hungarian algorithm is designed to maximize the total score, it can pair up different but similar invoices and displace a perfect match.
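The pipeline described above can be sketched roughly as follows. This is a minimal illustration and not the team's actual code: the function names (`clean_ocr_text`, `similarity`, `greedy_match`), the normalized Levenshtein score, and the idea of comparing one text string per invoice against one per record are all assumptions made for the example.

```python
def clean_ocr_text(raw):
    """Collapse extra spaces and drop empty lines from raw OCR output."""
    lines = (" ".join(line.split()) for line in raw.splitlines())
    return [line for line in lines if line]


def levenshtein(a, b):
    """Edit distance between two strings (dynamic programming, row by row)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Hypothetical score in [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def greedy_match(invoices, records):
    """Repeatedly take the highest-scoring pair whose sides are both unmatched."""
    pairs = sorted(
        ((similarity(inv, rec), i, j)
         for i, inv in enumerate(invoices)
         for j, rec in enumerate(records)),
        key=lambda t: -t[0],
    )
    used_invoices, used_records, matches = set(), set(), {}
    for score, i, j in pairs:
        if i not in used_invoices and j not in used_records:
            matches[i] = (j, score)
            used_invoices.add(i)
            used_records.add(j)
    return matches
```

For instance, `greedy_match(["INV-001 120.50", "INV-002 75.00"], ["INV-002 75.00", "INV-001 120.50"])` pairs each invoice string with its identical record, because those pairs have the highest scores and are claimed first.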
Challenges we ran into
OCR technology is not fully mature, so we faced the issue of getting inaccurate data from the invoices. It is also difficult to differentiate similar invoices. Finally, we were inexperienced in both the technological and auditing fields, which made it hard to understand how technology can best fit into audit work.
Accomplishments that we're proud of
One thing we are proud of, and that is unique to our product, is the matching algorithm. Instead of a single greedy approach that takes the most similar value, we use a matching algorithm that maximizes the total score of the invoice matching process, which makes our product more robust. Furthermore, we took a client-server approach instead of a one-stop machine learning script, to make our solution more scalable. In the future, if our product takes off, we can add a message queue or deploy on cloud services to make our solution more robust.
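To illustrate what maximizing the total score means, here is a small sketch. It is not the team's implementation: it brute-forces every assignment with `itertools.permutations` as a stand-in for a proper assignment solver, and the function name `best_total_assignment` and the score matrix are hypothetical. For realistic input sizes, a solver such as `scipy.optimize.linear_sum_assignment` with `maximize=True` would be the usual choice.

```python
from itertools import permutations


def best_total_assignment(scores):
    """Return (assignment, total) that maximizes the summed score.

    scores[i][j] is the similarity between invoice i and record j.
    Assumes there are at least as many records as invoices.
    Brute force: only suitable for a handful of invoices.
    """
    n_invoices, n_records = len(scores), len(scores[0])
    best_cols, best_total = None, float("-inf")
    for cols in permutations(range(n_records), n_invoices):
        total = sum(scores[i][j] for i, j in enumerate(cols))
        if total > best_total:
            best_cols, best_total = cols, total
    return list(best_cols), best_total


# Hypothetical similarity matrix: rows are invoices, columns are records.
scores = [
    [0.9, 0.8],
    [0.8, 0.1],
]
assignment, total = best_total_assignment(scores)
```

In this toy matrix, a greedy pass would pair invoice 0 with record 0 (0.9) and leave invoice 1 with a 0.1 match, while the total-maximizing assignment picks the two 0.8 pairs instead.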
What we learned
Throughout the hackathon, we benefited immensely from learning more about the accounting industry. We also learned about natural language processing and the difficulties of information retrieval. Another valuable takeaway was the research into the matching algorithms that we used. This has certainly benefited us professionally and helped us hone our business acumen.
What's next for RIPA
Customize OCR to better fit invoices.
Support users by providing suggestions instead of just auto-matching, improving the accuracy of the match.
Implement machine learning that improves invoice matching based on users' feedback.
Once accuracy is high enough, move to full automation and error recognition.
Sell it as a cloud service, or as an on-premise setup for users who are cautious about security and privacy issues.