I wanted to expand my Single University Salary Project to analyze differences in total salary between males and females in various positions at 33 of the top 50 public universities.
Overview of the Interface
The interface is a series of 33 sunburst diagrams, one for each of the 33 universities being analyzed. At the center of each sunburst diagram is the university's logo. The inner layer (colored brown) breaks the university down into the various positions represented in the data that was obtained. The angular width of each arc is proportional to the number of people in that position. The outer layer (colored pink and blue) breaks each position down further into the number of males and females employed in it. The angular width of each arc is once again proportional to the number of people, and the height of each arc scales with the total salary of that gender in that position, with the maximum height corresponding to the maximum total salary of any one gender across the entire school and the minimum being zero.
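The outer-layer geometry described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the function name `gender_arcs`, the `Arc` type, the input layout, the 100-unit maximum height, and the sample numbers are all hypothetical, and the sketch assumes that height scales linearly with total salary and that "maximum total salary" means the largest total for any single position and gender at the school.

```python
import math
from dataclasses import dataclass


@dataclass
class Arc:
    label: str
    start_angle: float  # radians, measured around the circle
    end_angle: float    # radians
    height: float       # radial height of the arc


def gender_arcs(positions, max_height=100.0):
    """Compute outer-layer arcs for one university.

    positions maps a position title to {gender: (head_count, total_salary)}.
    This layout is an assumption made for the example.
    """
    total_people = sum(count for genders in positions.values()
                       for count, _ in genders.values())
    if total_people == 0:
        return []
    # Largest total salary of any one gender in any one position.
    max_total = max(salary for genders in positions.values()
                    for _, salary in genders.values())

    arcs = []
    angle = 0.0
    for position, genders in positions.items():
        for gender, (count, total_salary) in genders.items():
            # Angular width is proportional to head count.
            sweep = 2 * math.pi * count / total_people
            # Height is scaled linearly between zero and max_height (assumed).
            height = max_height * total_salary / max_total if max_total else 0.0
            arcs.append(Arc(f"{position}/{gender}", angle, angle + sweep, height))
            angle += sweep
    return arcs


# Made-up numbers, for illustration only.
sample = {
    "Professor": {"male": (3, 300000.0), "female": (1, 120000.0)},
    "Lecturer": {"male": (2, 100000.0), "female": (2, 90000.0)},
}
for arc in gender_arcs(sample):
    print(arc)
```

The inner (brown) layer would follow the same pattern with one arc per position, its angular width being the sum of the widths of its male and female arcs.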
Challenges I ran into
At the beginning of this project, I did not realize how hard it would be to obtain all of the universities' data. Although some universities provided their data in easy-to-download Excel or PDF files, for most universities I had to scrape the data directly from the websites that hosted the salary data, either by using nightmare.js or by scraping the data through the Google Chrome or Firefox developer tools console.
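To give a sense of what scraping a salary table involves, here is a minimal sketch in Python using only the standard library's `html.parser`. It is not the nightmare.js or browser-console code used in the project, and it assumes the page exposes salaries in a plain HTML table; the class name `TableRows`, the helper `scrape_rows`, and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_rows(html):
    """Return all table rows found in an HTML string."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up markup standing in for a university salary page.
page = (
    "<table>"
    "<tr><th>Name</th><th>Title</th><th>Salary</th></tr>"
    "<tr><td>Jane Doe</td><td>Professor</td><td>$120,000</td></tr>"
    "</table>"
)
print(scrape_rows(page))
```

Real salary pages are often paginated or rendered by JavaScript, which is where a browser-driving tool such as nightmare.js becomes necessary.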
What's next for Multi-University Salary Data Project
Making the interface more user-friendly and adding the source links for all the data next to each sunburst.
Although I recognize that not everyone can be placed in the rigid categories of "male" and "female", the way I determined gender made it hard to place anyone in a category other than those two. In most cases, gender was determined by a function that searched for the person's given name in a comprehensive database containing the frequency with which each name is male or female. In all cases where this method was not used, the gender was provided in the data obtained from the school.
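A name-frequency lookup of this kind might look roughly like the sketch below. The function name `infer_gender`, the shape of the lookup table, the 0.5 threshold, and the counts are all hypothetical; the actual database and decision rule used in the project are not specified here.

```python
def infer_gender(given_name, name_counts, threshold=0.5):
    """Guess a gender from a given name using name-frequency counts.

    name_counts maps a lowercase name to (male_count, female_count).
    Returns "male", "female", or None when the name is missing or the
    counts do not favor either side beyond the threshold.
    """
    counts = name_counts.get(given_name.strip().lower())
    if counts is None:
        return None
    male, female = counts
    total = male + female
    if total == 0:
        return None
    male_share = male / total
    if male_share > threshold:
        return "male"
    if 1 - male_share > threshold:
        return "female"
    return None


# Made-up counts, for illustration only.
name_counts = {
    "james": (980, 20),
    "mary": (10, 990),
    "jordan": (500, 500),
}
print(infer_gender("James", name_counts))
print(infer_gender("Jordan", name_counts))
```

Names that are absent from the table or evenly split return `None` in this sketch, which would leave those records to be resolved some other way.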
This project is not currently open for public use.