Inspiration

We were inspired by Bill.com's ability to connect agencies with each other, and we recognized that agencies can learn from one another, even across disparate topics and solutions.

What it does

For each agency, we identify its most similar peer agency, compare the two agencies' previous purchases, and recommend new vendors that the agency could make use of based on the spending habits of its peer.
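The recommendation step can be sketched as follows. This is a minimal illustration rather than our exact code: the function name `recommend_vendors`, the nested-dictionary layout of `purchases`, the `top_n` cutoff, the ranking by the peer's total spend, and the vendor names in the example are all assumptions made for the sketch.

```python
def recommend_vendors(purchases, agency, peer, top_n=5):
    """Suggest vendors the peer agency uses that `agency` has not used yet.

    `purchases` maps agency name -> {vendor name -> total amount spent}.
    Ranking candidates by the peer's spend is an assumption of this sketch.
    """
    own = purchases.get(agency, {})
    # Keep only vendors that are new to the target agency.
    candidates = {
        vendor: amount
        for vendor, amount in purchases.get(peer, {}).items()
        if vendor not in own
    }
    # Highest peer spend first; vendor name breaks ties deterministically.
    ranked = sorted(candidates, key=lambda v: (-candidates[v], v))
    return ranked[:top_n]


# Hypothetical example data (not taken from the real dataset).
purchases = {
    "Parks": {"Staples": 120.0, "Delta": 900.0},
    "Libraries": {
        "Staples": 100.0,
        "Delta": 800.0,
        "BookSupplier": 4000.0,
        "ShelvingCo": 300.0,
    },
}
suggestions = recommend_vendors(purchases, "Parks", "Libraries")
```

Here `suggestions` contains only the vendors that Libraries has used and Parks has not.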

How we built it

We use a variant of TF-IDF with normalization, adapted for vendors and agencies, to describe how similar agencies are to one another. We build a TF-IDF matrix in which each row is an agency and each column is a type of vendor. TF stands for Term Frequency and is typically a simple count; instead, we use a min-max normalization of the transaction amounts for each vendor. As a result, our matrix represents how frequently agencies purchase from vendors without being overly influenced by the size of each purchase (e.g. even a small purchase from Delta is in most cases much larger than a purchase from Staples, but it is no more indicative of a trend). We then find the most similar agency for each agency using the cosine similarity between the corresponding rows of the matrix.
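One way to read this pipeline is sketched below in plain Python. It is an illustration under stated assumptions, not our exact implementation: transactions are assumed to be `(agency, vendor, amount)` tuples, the min-max scaling is applied to each vendor column of per-agency totals (with zero for agencies that never bought from that vendor), and the IDF uses a smoothed formula, `log((1 + N) / (1 + df)) + 1`, chosen for the sketch. All function names and the example data are hypothetical.

```python
import math
from collections import defaultdict


def build_tfidf(transactions):
    """Return (agencies, vendors, matrix) with one row per agency."""
    totals = defaultdict(float)
    for agency, vendor, amount in transactions:
        totals[(agency, vendor)] += amount

    agencies = sorted({a for a, _ in totals})
    vendors = sorted({v for _, v in totals})
    n = len(agencies)
    matrix = [[0.0] * len(vendors) for _ in agencies]

    for j, vendor in enumerate(vendors):
        column = [totals.get((a, vendor), 0.0) for a in agencies]
        lo, hi = min(column), max(column)
        span = hi - lo
        # Document frequency: number of agencies that bought from this vendor.
        df = sum(1 for x in column if x > 0)
        idf = math.log((1 + n) / (1 + df)) + 1  # smoothed IDF (assumption)
        for i, x in enumerate(column):
            # Min-max scaling replaces the usual term count. If every agency
            # spent the same amount, fall back to a 0/1 indicator.
            tf = (x - lo) / span if span > 0 else (1.0 if x > 0 else 0.0)
            matrix[i][j] = tf * idf
    return agencies, vendors, matrix


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(agencies, matrix, name):
    """Name of the agency whose row is closest to `name` by cosine similarity."""
    i = agencies.index(name)
    others = [k for k in range(len(agencies)) if k != i]
    best = max(others, key=lambda k: cosine_similarity(matrix[i], matrix[k]))
    return agencies[best]


# Hypothetical example data (not taken from the real dataset).
transactions = [
    ("Parks", "Staples", 120.0),
    ("Parks", "Delta", 900.0),
    ("Libraries", "Staples", 100.0),
    ("Libraries", "Delta", 800.0),
    ("Transit", "Caterpillar", 5000.0),
    ("Transit", "Staples", 20.0),
]
agencies, vendors, matrix = build_tfidf(transactions)
peer = most_similar(agencies, matrix, "Parks")
```

Because each vendor column is scaled to the range 0 to 1 before weighting, the large Caterpillar purchase does not dominate the comparison, and Parks is matched with Libraries, whose spending pattern it shares.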

Challenges we ran into

We noticed a small number of refunds in the dataset, which may have a small effect on our normalization step. We chose to drop them rather than incorporate them back into the dataset, because they occur unpredictably and are difficult to handle. We do not expect this to change the results much, but there may be some small differences.
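To illustrate why refunds matter for min-max normalization, here is a small sketch. It assumes refunds appear as negative amounts in `(agency, vendor, amount)` tuples, which may not match the real dataset's encoding; the helper names and the example values are hypothetical.

```python
def drop_refunds(transactions):
    """Keep only transactions with a non-negative amount.

    Assumes refunds are recorded as negative amounts (an assumption of
    this sketch, not a documented property of the dataset).
    """
    return [t for t in transactions if t[2] >= 0]


def min_max(values):
    """Scale values to [0, 1]; return all zeros if they are all equal."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical example data: one refund among ordinary purchases.
transactions = [
    ("Parks", "Staples", 100.0),
    ("Libraries", "Staples", 200.0),
    ("Transit", "Staples", 300.0),
    ("Transit", "Staples", -100.0),  # refund
]
with_refund = min_max([amount for _, _, amount in transactions])
without_refund = min_max([amount for _, _, amount in drop_refunds(transactions)])
```

With the refund left in, the column minimum becomes negative, so every other scaled value shifts upward. Dropping refunds keeps the scale anchored to real purchases.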

Accomplishments that we're proud of

We're most proud of the scalability and speed of our solution. Although it handles nearly half a million files, the entire process takes under a minute to run and output the results. Moreover, our code can easily be rerun on a new or larger dataset, which should increase its effectiveness.

What we learned

We learned a lot about the flow of transactions and purchases in government agencies. We would never have anticipated that there would be such connections across agencies, and we are fascinated by the ways that Bill.com can leverage this information to build relationships and collaborations around the world.

What's next for Bill.com the Flow of Money

Next, we would like to expand the dataset, since the quality of our method depends directly on the amount of data available to it.
