Inspiration

Invisible Women: Data Bias in a World Designed for Men by Caroline Criado Pérez

What it does

How we built it

I first started with a video of Caroline Criado Pérez, who wrote the book Invisible Women, in which she gives real-life cases where the gender data gap has seriously affected women’s lives. For example, most drug tests use male animals, with the excuse that “the menstrual cycles will affect the findings”, which leads to inaccurate prescriptions for women. There are heart arrhythmia drugs that are more likely to trigger a heart attack in the first half of a woman’s menstrual cycle. In the end, she points out that the medical, hiring, and public transportation data in real life are hopelessly biased towards one gender, and that the problem only grows when an algorithm receives such data and gets better and better at being biased.

In the article Examining How Gender Bias Is Built into AI, Manasi and Dr. Panchanadeswaran say that using a data set that lacks diversity and information on certain demographic categories, such as women of color, might skew results. They also say that research has repeatedly shown that AI models are often trained on male-centric data, which in turn yields results that misidentify women and people of color.

After reading more scholarly articles on the topic, I summarized a few cases that show the negative outcomes of using biased machine learning:

Google Translate: entering a gender-neutral sentence in Hindi such as “Vah ek doktar hai” (“That is a doctor”) gets translated into English as “He is a doctor.” Similarly, “Vah ek nurse hai” (“That is a nurse”) gets translated into English as “She is a nurse.” While it is true that there are more male doctors than female doctors due to rates of burnout (Poorman 2018), the algorithm interprets this as a stronger association of males with doctors and females with nurses. Thus, the stereotype that doctors are male and nurses are female is perpetuated.

Job search algorithms show the same problem. Incorrectly learning the social disparity that women are overrepresented in lower-paying jobs as a preference among women for such jobs, the algorithms recommend lower-paying jobs to women than to men (Bolukbasi et al. 2016).

If a firm uses reviews containing gender bias to design products and promotions, recommend products to consumers, determine how to portray individuals in a commercial, segment consumers, and so on, it could produce biased recommendations and messages or insensitive portrayals. Specifically, a recommender trained on biased reviews would learn consumer vulnerabilities and use them against female consumers. Moreover, if these reviews are used to provide recommendations in the marketplace, women would be recommended fewer career-oriented products, e.g. fewer online courses and job advertisements, or even lower-paying jobs (e.g., Bolukbasi et al. 2016).

Amazon decided to use an AI-powered recruiting tool in 2015. The automated tool was trained to identify promising applicants by observing patterns in resumes submitted to Amazon over a 10-year period. It would then assign job candidates scores ranging from one to five stars. However, most of the resumes came from men, given that the tech industry is notoriously male-dominated. Based on the data it used, the system taught itself to prefer male candidates over female ones, and penalized resumes that included the word “women’s”, as in “women’s chess club captain”. It also downgraded graduates of two unnamed all-women’s colleges and gave preference to what Reuters referred to as “masculine language” and the use of stronger verbs such as “executed” or “captured.”

……

Based on the findings, most of the critical voices focus on how the data fed into machine learning is skewed towards males in the first place, and on how the algorithms keep learning from whatever data humans feed them. Some advocate debiasing ML systems, for example by augmenting the training data with copies in which gendered pronouns and words are replaced with those of the opposite gender, and by blanking out names, so that the training data becomes gender neutral and does not associate gender characteristics with certain names. However, this has not yet occurred to most people, especially those who design and develop these systems.
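
As a rough illustration of the augmentation idea described above, here is a minimal Python sketch. The word list, the `[NAME]` placeholder, the list of names, and the function names (`swap_gender`, `blank_names`, `augment`) are all assumptions made for illustration; they are not taken from any of the cited work, and a real system would need a much larger vocabulary and proper handling of ambiguous words.

```python
import re

# Hypothetical, deliberately tiny word list (an assumption for illustration).
# Known limitation: "her" is ambiguous ("her book" vs. "saw her"); this
# sketch always maps it to "his", which is wrong for the second case.
SWAP = {
    "he": "she", "she": "he",
    "him": "her", "his": "her", "her": "his",
    "man": "woman", "woman": "man",
    "men": "women", "women": "men",
}

# Match any listed word as a whole word, ignoring case.
SWAP_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(w) for w in SWAP) + r")\b",
    re.IGNORECASE,
)


def swap_gender(text):
    """Replace each listed gendered word with its counterpart."""
    def repl(match):
        word = match.group(0)
        swapped = SWAP[word.lower()]
        # Keep a leading capital letter if the original word had one.
        return swapped.capitalize() if word[0].isupper() else swapped
    return SWAP_PATTERN.sub(repl, text)


def blank_names(text, names):
    """Replace every listed first name with a neutral placeholder."""
    if not names:
        return text
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return pattern.sub("[NAME]", text)


def augment(sentences, names):
    """Return each sentence with names blanked, plus a gender-swapped copy."""
    result = []
    for sentence in sentences:
        neutral = blank_names(sentence, names)
        result.append(neutral)
        result.append(swap_gender(neutral))
    return result


if __name__ == "__main__":
    for line in augment(["John said he was hired."], ["John"]):
        print(line)
```

Training on both the original and the swapped copy means the model sees each sentence with either gender equally often, which is the point of this kind of augmentation; it does not remove bias that is carried by words outside the list.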

Since there are already plenty of videos explaining gender bias in machine learning, I decided to do something different, more like telling a story from the perspective of a machine learning model. It is innocent, trying to defend itself from outside criticism yet feeling confused and down. The narration should be calm, while the audience may sense indifference and helplessness in the tone, depending on how they perceive the video.

For the visuals, I wanted to find a balance between informative and aesthetic. I didn’t want it to become an educational short video; rather, I hope the audience can feel the emotion hidden between the lines. A head model was used throughout the shooting to represent the machine learning model, with black-and-white movie clips in the background. Besides the information I summarized above, I also collected plenty of online images and generated a few using Stable Diffusion, a generative AI tool.

I named the project The Default to indicate that machine learning has learnt, from the millions of data points it received, that male is the default gender of human beings, and that it leans towards this assumed default image in each and every decision it makes.

Challenges we ran into

Accomplishments that we're proud of

What we learned

What's next for The Default

In general, I very much enjoyed the process of doing research and producing a video related to both the course material and what I am passionate about. I learned a lot and received genuine feedback from my peers and professors. I wouldn’t say the video project is well polished, as I believe it only touches upon a narrow aspect of the issue, and there is definitely personal bias in it even though I tried to avoid it as much as possible. I would very much like to turn the project into something more intuitive and interactive through a website or other medium if I have time.

Built With

  • stablediffusion