There are definitely some weird things going on in this graph. This inspired us to look into fraud!
See that little blue line at the bottom? That's how many loans the SBA processed on an average day in 2019. Now look at 2020, with the PPP.
There's more than 2 orders of magnitude of difference between 2019 & 2020. (Same blue line as earlier).
Clearly there are a few ... unusual ... PPP applicants, and reviewing their applications again may be worthwhile.
My mom runs a small afterschool program. She often works 11-hour days, driving to schools to pick up kids, buying supplies for the teachers & students, and interacting constantly with parents. It's a demanding job, but one that she loves because she's able to help these kids grow into incredible human beings.
When COVID-19 hit, her business was one of the first to shut down. At the time, I was confused, but looking back, oh boy, was that a good call. She struggled to pay her teachers and we weren't in a good position. Things were looking rougher than they had in years.
Until the PPP came along. With the money from the PPP loan, she would be able to pay her teachers for months. Although her doors literally stayed shut, we would be able to keep them open for months while we waited for a recovery.
28 minutes after my dad got the email for the PPP, he signed up. 28 minutes. That's roughly how long it takes the average American to get to work.
But my, were those 28 minutes consequential. Because of them, we had to wait two months as we slowly progressed through a massive queue, hoping we would be able to sustain ourselves long enough in the meantime.
We were fortunate. Before things got too tense, we were able to navigate the bureaucracy and get the funds we needed. But I still wondered, what took so long?
As I began to explore the PPP data I found, I became curious about the fraud associated with some companies in the program. I'd heard about isolated incidents of people abusing it to treat themselves to luxury. So I decided to pursue this and see where it would take me.
What it does
We built a tool that does anomaly detection on incoming PPP applications to identify which ones are likely to be fraudulent.
We also built some analytics infrastructure to better understand the data we had.
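To give a feel for what anomaly detection on applications can look like, here is a minimal sketch. It is not the model from our Colab notebook: the technique (a robust z-score based on the median absolute deviation), the field names `loan_amount` and `jobs_reported`, the 3.5 threshold, and the sample records are all assumptions chosen for illustration.

```python
from statistics import median


def robust_z_scores(values):
    """Return modified z-scores using the median and the median absolute deviation (MAD)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half the values equal the median; this sketch then flags nothing.
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD so scores are comparable to ordinary z-scores.
    return [0.6745 * (v - med) / mad for v in values]


def flag_anomalies(applications, threshold=3.5):
    """Return applications whose loan amount per reported job is an outlier.

    The field names and the threshold are hypothetical, not taken from the PPP dataset.
    """
    ratios = [a["loan_amount"] / max(a["jobs_reported"], 1) for a in applications]
    scores = robust_z_scores(ratios)
    return [a for a, s in zip(applications, scores) if abs(s) > threshold]


# Made-up records for illustration only; this is not real PPP data.
sample = [
    {"id": "A", "loan_amount": 20_000, "jobs_reported": 2},
    {"id": "B", "loan_amount": 45_000, "jobs_reported": 5},
    {"id": "C", "loan_amount": 110_000, "jobs_reported": 10},
    {"id": "D", "loan_amount": 30_000, "jobs_reported": 3},
    {"id": "E", "loan_amount": 900_000, "jobs_reported": 1},
]

flagged = flag_anomalies(sample)
print([a["id"] for a in flagged])
```

A flag like this only means "worth a second look", not "fraudulent". A median-based score is used here because a handful of extreme applications would otherwise distort a mean and standard deviation.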
How we built it
We have everything on Google Colab, ready to go. The link is below.
Challenges we ran into
We had a host of problems, with people leaving the team in the middle of the hackathon, with trying to source data, with defining a direction to go down. I'm glad we were able to build relationships, persevere through the disagreement, and hopefully reach the other side of the tunnel!
Accomplishments that we're proud of
We're very proud of finishing our first-ever datathon, to be honest. We're proud of the work we've done, and of helping to ensure that more money is available for those who really need it.
What's next for Identifying Potential Fraud With PPP
As we progress, we'd love to include more features and granularity with our data. Here, we were constrained by the limits of Colab. With some operations on the cloud, we may be able to do much better.
It would also be great to give this tool to the people running the PPP, so they have a backup that ensures they don't miss glaring errors in PPP applications or usage.