Tech Sector Employment Diversity in Silicon Valley

AIM: To understand workplace diversity in the technology sector through an observational study of employment statistics from Silicon Valley tech companies.

OBJECTIVES:

  • Observe workplace diversity in terms of:
    • Gender: Male vs. Female (sex ratio)
    • Race: White/Caucasian vs. Non-White (non-white ratio)
    • Job Category: White Collar Jobs vs. Blue Collar Jobs (blue-collar ratio)
  • Draw bar graphs/pie charts of the factors:
    • Company Diversity
    • Racial Diversity
    • Gender Diversity
    • Job Role Diversity
  • Deduce summary statistics on the figures of the aforementioned criteria:
    • Mean
    • Median
    • Standard Deviation
    • Quantile Ranges
  • Draw histograms and plots of the factors’ spread:
    • Sex Ratio
    • Non-White Ratio
    • Blue-Collar Ratio
  • Generate and plot linear models to show the linear independence of the diversity factors.

THEORY:

  1. Sex Ratio: In this context, the sex ratio is the ratio of the number of female employees to the total number of employees. It can also be used to derive the inverted sex ratio (the ratio of the number of male employees to the total number of employees).
  2. Non-White Ratio: In this context, the non-white ratio is the ratio of the number of non-white employees (American-Indian/Alaskan Native, Asian, Black/African-American, Hispanic/Latino, Native Hawaiian/Pacific Islander, Multiracial) to the total number of employees. It can also be used to derive the white ratio (the ratio of the number of white employees to the total number of employees).
  3. Blue-Collar Ratio: In this context, the blue-collar ratio is the ratio of the number of blue-collar employees (Craft Workers, Laborers/Helpers, Operatives, Service Workers, Technicians) to the total number of employees. It can also be used to derive the white-collar ratio (the ratio of the number of white-collar employees to the total number of employees).

    While it may seem intuitive that the three factors above are directly proportional or otherwise linearly related, this project aims to show that they are linearly independent of each other, a point that is especially relevant for the data from Silicon Valley.
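The three definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's own code: the function name `diversity_ratios`, the company "ExampleCo" and its counts are made up, and rows whose race is "Overall Totals" or whose job category is "Totals" are assumed to be aggregates of the other rows.

```python
# Category groupings taken from the definitions in the THEORY section.
BLUE_COLLAR = {"Craft Workers", "Laborers/Helpers", "Operatives",
               "Service Workers", "Technicians"}
WHITE = "White/Caucasian"


def diversity_ratios(rows):
    """Return (sex_ratio, non_white_ratio, blue_collar_ratio) for the rows."""
    # Assumption: aggregate rows repeat counts already present in the
    # detailed rows, so they are skipped to avoid double counting.
    cells = [r for r in rows
             if r["race"] != "Overall Totals" and r["job_category"] != "Totals"]
    total = sum(r["count"] for r in cells)
    if total == 0:
        raise ValueError("no employees in the given rows")
    female = sum(r["count"] for r in cells if r["gender"] == "female")
    non_white = sum(r["count"] for r in cells if r["race"] != WHITE)
    blue = sum(r["count"] for r in cells if r["job_category"] in BLUE_COLLAR)
    return female / total, non_white / total, blue / total


def row(race, gender, job_category, count):
    """Build one record in the shape of the cleaned working data."""
    return {"company": "ExampleCo", "race": race, "gender": gender,
            "job_category": job_category, "count": count}


# Made-up counts for a hypothetical company (not taken from the dataset).
sample = [
    row("White/Caucasian", "male", "Professionals", 40),
    row("White/Caucasian", "female", "Professionals", 20),
    row("Asian", "male", "Technicians", 10),
    row("Asian", "female", "Managers", 20),
    row("Hispanic/Latino", "female", "Operatives", 10),
    row("Overall Totals", None, "Totals", 100),  # aggregate row, skipped
]

sex_ratio, non_white_ratio, blue_collar_ratio = diversity_ratios(sample)
print(sex_ratio, non_white_ratio, blue_collar_ratio)
```

The complementary ratios follow directly, for example `1 - sex_ratio` for the inverted sex ratio.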

DATASET METADATA:
Source: https://github.com/cirlabs/Silicon-Valley-Diversity-Data/blob/master/Reveal_EEO1_for_2016.csv (Data is available under the Open Database License)

Credits: "Reveal from The Center for Investigative Reporting." https://www.revealnews.org/svdiversity

Cleaned Working Data:

      company            race gender     job_category count
   1: 23andMe Hispanic/Latino   male       Executives     0
   2: 23andMe Hispanic/Latino   male         Managers     1
   3: 23andMe Hispanic/Latino   male    Professionals     7
   4: 23andMe Hispanic/Latino   male      Technicians     0
   5: 23andMe Hispanic/Latino   male    Sales Workers     0
  ---
4121: Sanmina  Overall Totals     NA       Operatives  1660
4122: Sanmina  Overall Totals     NA Laborers/Helpers     4
4123: Sanmina  Overall Totals     NA  Service Workers    57
4124: Sanmina  Overall Totals     NA           Totals  5205
4125: Sanmina  Overall Totals     NA         Managers   591


  • company : the various companies centered around Silicon Valley
    25 levels: "23andMe", "Adobe", "Airbnb", "Apple", "Cisco", "eBay", "Facebook", "Google", "HP Inc.", "HPE", "Intel", "Intuit", "LinkedIn", "Lyft", "MobileIron", "NetApp", "Nvidia", "PayPal", "Pinterest", "Salesforce", "Sanmina", "Square", "Twitter", "Uber", "View"

  • race : the race-wise distribution of employees
    8 levels: "American-Indian/Alaskan Native", "Asian", "Black/African-American", "Hispanic/Latino", "Native Hawaiian/Pacific Islander", "Overall Totals", "Multiracial", "White/Caucasian"

  • gender : the gender-wise distribution of employees
    3 levels: "male", "female", NA

  • job_category : the job type classifications of employees
    11 levels: "Administrative Support", "Craft Workers", "Executives", "Laborers/Helpers", "Managers", "Operatives", "Professionals", "Sales Workers", "Service Workers", "Technicians", "Totals"
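As a rough illustration of how this long-format layout can be tabulated for the diversity tables and charts, the sketch below sums counts per level of one column. The helper names `totals_by` and `shares` are hypothetical, the rows are copied from the preview above, and it is an assumption that rows with race "Overall Totals" or job category "Totals" are aggregates that should be left out of the sums.

```python
from collections import Counter


def totals_by(rows, column):
    """Sum employee counts per level of `column`, skipping aggregate rows."""
    totals = Counter()
    for r in rows:
        # Assumption: these rows duplicate counts found in the detailed rows.
        if r["race"] == "Overall Totals" or r["job_category"] == "Totals":
            continue
        totals[r[column]] += r["count"]
    return totals


def shares(totals):
    """Convert per-level totals into fractions of the grand total."""
    grand = sum(totals.values())
    if grand == 0:
        raise ValueError("nothing to share out: all counts are zero")
    return {level: n / grand for level, n in totals.items()}


def rec(company, race, gender, job_category, count):
    """Build one record with the five columns of the cleaned data."""
    return {"company": company, "race": race, "gender": gender,
            "job_category": job_category, "count": count}


# Rows taken from the data preview; None stands in for NA.
rows = [
    rec("23andMe", "Hispanic/Latino", "male", "Executives", 0),
    rec("23andMe", "Hispanic/Latino", "male", "Managers", 1),
    rec("23andMe", "Hispanic/Latino", "male", "Professionals", 7),
    rec("23andMe", "Hispanic/Latino", "male", "Technicians", 0),
    rec("23andMe", "Hispanic/Latino", "male", "Sales Workers", 0),
    rec("Sanmina", "Overall Totals", None, "Operatives", 1660),
    rec("Sanmina", "Overall Totals", None, "Totals", 5205),
]

by_job = totals_by(rows, "job_category")
by_company = totals_by(rows, "company")
job_shares = shares(by_job)
print(dict(by_job))
print(dict(by_company))
```

The per-level totals are the kind of input a bar chart needs, and the shares are what a pie chart would display.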

Notes:

  • All of the data is from the year 2016. This is noted here because the year column was removed while cleaning the data.

OBSERVATIONS & CONCLUSIONS:

  1. Categorical Data

    Figure 1: Company Diversity Pie Chart

    Table 1: Company Diversity Table

    Figure 2: Gender Diversity Pie Chart

    Table 2: Gender Diversity Table

    Figure 3: Racial Diversity Pie Chart

    Table 3: Racial Diversity Table

    Figure 4: Job Diversity Pie Chart

    Table 4: Job Diversity Table

    Figure 5: Bar Chart for Company Diversity

    Figure 6: Bar Chart for Gender Diversity

    Figure 7: Bar Chart for Racial Diversity

    Figure 8: Bar Chart for Job Role Diversity

  2. Statistical Summaries

  3. Discrete Distributions of Sex Ratio, Non-White Ratio and Blue-Collar Jobs Ratio

  4. Linear Models to Prove Linear Independence
    1. Sex Ratio ~ Non-White Ratio
    2. Sex Ratio ~ Blue-Collar Jobs Ratio
    3. Non-White Ratio ~ Blue-Collar Jobs Ratio
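The statistical summaries and pairwise linear models listed above could be computed along the following lines. This is a minimal sketch, not the project's analysis: the per-company ratio values are invented for illustration, `summarize` and `linear_fit` are hypothetical helper names, and the fit is a plain ordinary least squares line with its coefficient of determination.

```python
from statistics import mean, median, quantiles, stdev


def summarize(values):
    """Mean, median, sample standard deviation and quartiles of the values."""
    q1, _, q3 = quantiles(values, n=4)  # cut points for four equal groups
    return {"mean": mean(values), "median": median(values),
            "sd": stdev(values), "q1": q1, "q3": q3}


def linear_fit(x, y):
    """Least squares fit of y ~ x; returns (slope, intercept, r_squared).

    Raises ZeroDivisionError if x or y is constant.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Invented per-company ratios, for illustration only.
sex_ratios = [0.30, 0.45, 0.25, 0.40, 0.35]
non_white_ratios = [0.50, 0.40, 0.55, 0.60, 0.45]

print(summarize(sex_ratios))
print(linear_fit(non_white_ratios, sex_ratios))  # Sex Ratio ~ Non-White Ratio
```

An `r_squared` close to 0 for a pair would be consistent with the two ratios having no linear relationship. A value close to 1 would indicate a strong one.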

Built With

  • ggplot
  • r
  • tidyverse