GenderBiasGPT

GenderBiasGPT is a tool designed to detect gender bias in text inputs, specifically for non-English languages like Vietnamese and Hindi. This model assesses whether a sentence is biased toward a male or female perspective and provides a bias score ranging from 0 to 1.

Inspiration

  • We notice that English word-embedding models are biased: the vector for "Man" is about as close to "Programmer" as the vector for "Woman" is to "Homemaker".
  • We want to leverage the fact that language models often reflect Western societal stereotypes to build an algorithm that detects gender bias through a bias score.
  • Our inspiration also comes from the need to detect gender bias in languages other than English. While there are some models available to analyze bias in English text, few tools are dedicated to regional languages. GenderBiasGPT fills this gap by providing a model that identifies gender bias in sentences written in Vietnamese or Hindi.

What it Does

GenderBiasGPT evaluates text inputs in Vietnamese or Hindi and calculates their cosine similarity with the gender_direction vector, indicating the level of bias toward a male or female perspective. This score provides insight into how strongly a sentence may be biased in a particular direction.

How We Built It

Our approach is based on two papers, "Estimating Gender Bias in Sentence Embeddings" and "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings", both of which study English. We reverse-engineered their methods and worked out how to apply the algorithm to Vietnamese and Hindi.

Our metric, called bias_score, is based on four elements:

  • cosine similarity between two vectors x and y - cos(x,y)
  • a gender direction in the vector space that captures all gender information - D
  • a list of gendered words - L
  • the semantic importance of a word - I_w

The bias score of a sentence is the sum, over its words, of each word's cosine similarity with the gender direction multiplied by that word's importance in the sentence: bias_score = Σ_w cos(w, D) · I_w, where w ranges over the word vectors in the sentence.
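
As a rough sketch of that computation (the embedding function and the gendered word pairs below are placeholders for illustration, and whether GenderBiasGPT builds D in exactly this way is an assumption):

import numpy as np

def cosine_similarity(x, y):
    # cos(x, y) = (x . y) / (|x| * |y|)
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def gender_direction(embed, gendered_pairs):
    # One common construction (following the Bolukbasi et al. paper above):
    # average the difference vectors between female and male words from L.
    diffs = [embed(female) - embed(male) for female, male in gendered_pairs]
    direction = np.mean(diffs, axis=0)
    return direction / np.linalg.norm(direction)

def bias_score(tokens, embed, importance, D):
    # bias_score = sum over words w of cos(w, D) * I_w
    return sum(cosine_similarity(embed(w), D) * importance[w] for w in tokens)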

Challenges We Faced

  • Finding an accurate language model for tokenization and embedding in Vietnamese and Hindi is hard; we had to try multiple models to get the best accuracy.
  • Each language requires a distinct approach to capture the nuances that indicate gender bias.

Accomplishments We're Proud Of

  • Successfully defining a gender subspace to analyze bias.
  • Achieving an accuracy of around 70% in bias detection.
  • Developing an interactive output that displays the bias score in a user-friendly way.

What We Learned

We explored the nuances of implementing a gender bias model, especially in adapting concepts from English-language models to regional languages. Experimenting with various language models helped us understand the complexities of multi-language bias detection and how important diversity is in developing these technologies.

What's Next for GenderBiasGPT

  • Scaling to other languages and adding support for additional regional dialects.
  • Expanding detection to other forms of bias, including social, political, and racial biases.
  • Working to increase accuracy and enhance the user experience.

GenderBiasGPT is an exciting first step toward making bias detection tools accessible for a wider range of languages and perspectives. We look forward to expanding and improving our model!


Setup Guide

Import the language embedder that you want

from BiasDetectionHelper import VietnameseEmbedder, HindiEmbedder

Initialize the embedder

vietnamese_embedder = VietnameseEmbedder()

Get the gender bias score of your sentence

Code:

sentence = "Một đàn ông xòe ra hai cái cánh"
bias_score = vietnamese_embedder.get_gender_bias_score_of_sentence(sentence)
print(bias_score)

Output:

{'female_bias_score': 0.13118663953125884,
 'male_bias_score': -0.08155740845883201,
 'bias_tokens': {
     'một':     {'cosine_similarity': 0.13598315,  'word_importance': 0.20386893917965399},
     'đàn_ông': {'cosine_similarity': -0.4898948,  'word_importance': 0.16647943035338808},
     'x@@':     {'cosine_similarity': 0.16068122,  'word_importance': 0.10408949010868583},
     'ò@@':     {'cosine_similarity': 0.19100048,  'word_importance': 0.10127540317761463},
     'e':       {'cosine_similarity': 0.17239621,  'word_importance': 0.07683502016518115},
     'ra':      {'cosine_similarity': 0.1331513,   'word_importance': 0.07592466319248105},
     'hai':     {'cosine_similarity': 0.08441967,  'word_importance': 0.10129592758936277},
     'cái':     {'cosine_similarity': 0.21521486,  'word_importance': 0.09085246189998558},
     'cánh':    {'cosine_similarity': 0.20075066,  'word_importance': 0.07937866433364689}
 }
}
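
In this example, the two aggregate scores appear to be the importance-weighted sums of the tokens with positive and negative cosine similarity, respectively; you can check this against the per-token breakdown above (a small sketch, reusing the bias_score dict returned earlier):

female = sum(t['cosine_similarity'] * t['word_importance']
             for t in bias_score['bias_tokens'].values() if t['cosine_similarity'] > 0)
male = sum(t['cosine_similarity'] * t['word_importance']
           for t in bias_score['bias_tokens'].values() if t['cosine_similarity'] < 0)
print(female, male)  # roughly 0.1312 and -0.0816, matching the scores above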

Information

The backend is built with Flask, the frontend is built with Streamlit, and the bias score is generated by a regression model that uses cosine similarity to evaluate gender bias in the text.
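
As a rough sketch of how the Flask backend might expose the embedder (the route name and request format here are assumptions for illustration, not our documented API):

from flask import Flask, jsonify, request
from BiasDetectionHelper import VietnameseEmbedder  # HindiEmbedder can be wired up the same way

app = Flask(__name__)
embedder = VietnameseEmbedder()

# Hypothetical route: the actual endpoint name and payload shape may differ.
@app.route("/bias_score", methods=["POST"])
def score():
    sentence = request.get_json().get("sentence", "")
    return jsonify(embedder.get_gender_bias_score_of_sentence(sentence))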



Built With

  • cosine-similarity
  • python
  • streamlit
  • transformer
  • word-embeddings