Given State Farm's involvement with this hackathon and our team's shared background in data analytics, we found this database an interesting one to analyze and represent visually.
What it does
We rank the financial institutions in the database according to the consumer complaints filed against them. We also built an interactive map that shows where complaints have been made and which regions are more prone to financial complaints.
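As a rough illustration of the ranking idea, the sketch below counts complaints per institution with pandas and orders the result. It is not our notebook code: the column name "Company", the helper rank_by_complaints, and the sample rows are made up for the example, and it assumes that rank 1 means the most complaints.

```python
import pandas as pd

# Tiny made-up sample; the real database has about 2 million rows.
# The column name "Company" is an assumption about the schema.
complaints = pd.DataFrame(
    {"Company": ["Bank A", "Bank B", "Bank A", "Bank C", "Bank A", "Bank B"]}
)


def rank_by_complaints(df, company_col="Company"):
    """Return one row per institution, ordered from most to fewest complaints."""
    counts = df.groupby(company_col).size().rename("complaints").reset_index()
    counts = counts.sort_values("complaints", ascending=False, ignore_index=True)
    # method="min" gives tied institutions the same rank.
    counts["rank"] = counts["complaints"].rank(method="min", ascending=False).astype(int)
    return counts


ranking = rank_by_complaints(complaints)
print(ranking)
```

A raw count favors small institutions, so a fuller version would probably scale the counts by something like customer base before ranking.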
How we built it
We used several libraries, including pandas, NumPy, folium, uszipcode, and scikit-learn, to process the data, build the ratings, and draw the map. Please see our four Jupyter Notebooks on GitHub, where we show how we processed the data (2 million entries).
Challenges we ran into
We couldn't build a choropleth map as initially planned because our data uses zip codes to show where complaints have been made. We were able to extract polygons and coordinates for each zip code, but we could not convert all of that data into a GeoJSON file to draw the zip code boundaries. Therefore, we settled for using only coordinates and markers to show where complaints take place.
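The marker fallback described above can be sketched roughly as follows. This is a minimal example under stated assumptions, not our notebook code: the helpers count_by_zip and build_marker_map are hypothetical names, and the centroids dictionary stands in for the zip code coordinates we looked up, with approximate values chosen only for illustration.

```python
from collections import Counter


def count_by_zip(zip_codes):
    """Count complaints per zip code, skipping missing values."""
    return Counter(z for z in zip_codes if z)


def build_marker_map(counts, centroids):
    """Place one marker per zip code that has a known centroid."""
    # Imported here so that count_by_zip can be used without folium installed.
    import folium

    # Start roughly at the center of the contiguous United States.
    complaint_map = folium.Map(location=[39.8, -98.6], zoom_start=4)
    for zip_code, n in counts.items():
        if zip_code not in centroids:
            continue  # no coordinates available for this zip code
        lat, lon = centroids[zip_code]
        folium.Marker(
            location=[lat, lon],
            popup=f"{zip_code}: {n} complaints",
        ).add_to(complaint_map)
    return complaint_map


if __name__ == "__main__":
    # Made-up complaint rows and approximate, illustrative centroids.
    sample_zips = ["10001", "10001", None, "94105"]
    centroids = {"10001": (40.75, -73.99), "94105": (37.79, -122.39)}
    build_marker_map(count_by_zip(sample_zips), centroids).save("complaints_map.html")
```

Aggregating to one marker per zip code before drawing keeps the map usable; placing a marker for each of the roughly 2 million rows would likely be too heavy for a browser.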
For the ranking system, we had to be creative and adjust the evaluation system accordingly. We used scikit-learn to normalize our ranking system.
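One way such a normalization can look is shown below, as a sketch only. It assumes min-max scaling with scikit-learn's MinMaxScaler and assumes that fewer complaints should yield a higher score; the helper complaint_scores and the sample counts are made up, and our notebooks may use a different scaler or formula.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler


def complaint_scores(counts):
    """Map complaint counts to scores in [0, 1], where 1 is best (fewest complaints)."""
    # MinMaxScaler expects a 2D array: one row per institution, one column per feature.
    column = np.asarray(counts, dtype=float).reshape(-1, 1)
    scaled = MinMaxScaler().fit_transform(column).ravel()  # default range is (0, 1)
    # Inverting the scale is an assumption: fewer complaints means a better score.
    return 1.0 - scaled


# Made-up complaint counts for four institutions.
scores = complaint_scores([120, 45, 300, 45])
print(scores)
```

Min-max scaling is sensitive to outliers, so a single institution with a very large count compresses everyone else toward the top of the range.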
Accomplishments that we're proud of
We're proud of communicating well and finishing such a vast project with just two people on the team.
What we learned
We learned how to deal with pressure and manage our time during a time crunch.
What's next for Consumer Finance Complaints | Ranking and Analysis
We hope to make the map more interactive, allowing consumers to look up the rankings of their neighborhood banks, the number of complaints recorded against them, and more.