Inspiration

I was inspired to build a tool for training more ethical AI after seeing the output of biased models, such as the depixelizer model that turned a pixelated picture of Obama into a white-looking person. That motivated me to put my coding hobby, and my habit of open-sourcing my work, toward tools for building less-biased AI.

What it does

I refactored the AdversarialDebiasing class from IBM's AI Fairness 360 open-source Python library into a more user-friendly and flexible PyTorch version. The class can now dynamically instantiate a classifier from a user-defined artificial neural network architecture and automatically generate a corresponding adversary model, which helps the classifier train with less bias. Users can also pass pre-instantiated models in place of the defaults. This lets AI and ML engineers train any binary classification model within an adversarial debiasing framework, and it provides a starting point for further customization of the process. A future version will enable multi-class predictions for both the classifier and adversary models. You can click here to see the source code and the new demo notebook in Google Colab.
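To make the idea of a user-defined architecture with an auto-generated adversary more concrete, here is a minimal sketch in plain PyTorch. The names `build_ann`, `Classifier`, `Adversary`, and `hidden_sizes` are hypothetical and chosen for illustration; they are not the actual Torchbearer or AI Fairness 360 API, and the adversary layout is a simplifying assumption.

```python
import copy

import torch
from torch import nn


def build_ann(n_features, hidden_sizes):
    """Stack Linear+ReLU blocks and end in a single logit (no final activation)."""
    layers = []
    in_dim = n_features
    for out_dim in hidden_sizes:
        layers += [nn.Linear(in_dim, out_dim), nn.ReLU()]
        in_dim = out_dim
    layers.append(nn.Linear(in_dim, 1))
    return nn.Sequential(*layers)


class Classifier(nn.Module):
    """Binary classifier: returns both the probability and the raw logit."""

    def __init__(self, ann):
        super().__init__()
        self.ann = ann

    def forward(self, x):
        logits = self.ann(x)
        return torch.sigmoid(logits), logits


class Adversary(nn.Module):
    """Hypothetical adversary: a copy of the classifier's ANN plus a small
    head that tries to predict the protected attribute from the logit."""

    def __init__(self, ann):
        super().__init__()
        self.ann = copy.deepcopy(ann)  # independent copy of the architecture
        self.head = nn.Linear(1, 1)

    def forward(self, x):
        return torch.sigmoid(self.head(self.ann(x)))


# The user only describes the architecture; both models are derived from it.
classifier = Classifier(build_ann(n_features=8, hidden_sizes=[16, 4]))
adversary = Adversary(classifier.ann)

x = torch.randn(5, 8)
probs, logits = classifier(x)
protected_pred = adversary(x)
```

A pre-instantiated `nn.Module` could be passed in place of `build_ann(...)` in the same way, which is the kind of swap the paragraph above describes.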

How I built it

I created default ANN, classifier, and adversary models that are defined outside the main AdversarialDebiasing class and, as mentioned above, can be modified or swapped out by users. The defaults are based on the original hard-coded TensorFlow implementation. That implementation also used a staircase exponential decay schedule for the learning rate, which I implemented by adapting an existing version published as a GitHub Gist.

The adversary model replicates the classifier's ANN up to, but not including, the final activation. At each global step of the fitting loop, the classifier's weights and biases are embedded into the adversary, which forces the adversary to optimize them toward its own objective of discerning the protected attribute value. The gradients of these layers are then projected back to the classifier using the method described in the original white paper, so that the classifier maximizes its own accuracy while also trying to minimize the adversary's accuracy.

To verify the port, I took the original notebook that demonstrated the TensorFlow version and changed its parameters to match the new required inputs. With that, I confirmed with the AI Fairness 360 development team that the code works as it should. The refactored code can be reviewed by clicking here.
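The two mechanisms described above, the staircase learning-rate decay and the gradient projection, can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the project's code: the adversary here is just a small head on the classifier's logits rather than a replica of the ANN, and `staircase_decay`, `project_gradients`, and the constants (`decay_steps=1000`, `decay_rate=0.96`, `adversary_loss_weight=0.1`) are hypothetical names and values.

```python
import torch
from torch import nn


def staircase_decay(step, decay_steps=1000, decay_rate=0.96):
    """LR multiplier that drops by decay_rate once every decay_steps steps."""
    return decay_rate ** (step // decay_steps)


def project_gradients(grads_cls, grads_adv, adversary_loss_weight=0.1, eps=1e-8):
    """Per parameter: remove the component of the classifier gradient that
    lies along the adversary gradient, then subtract a weighted adversary
    gradient so the update also works against the adversary."""
    projected = []
    for g_c, g_a in zip(grads_cls, grads_adv):
        unit = g_a / (g_a.norm() + eps)
        g = g_c - (g_c * unit).sum() * unit - adversary_loss_weight * g_a
        projected.append(g)
    return projected


torch.manual_seed(0)
x = torch.randn(32, 3)                         # features
y = torch.randint(0, 2, (32, 1)).float()       # labels
z = torch.randint(0, 2, (32, 1)).float()       # protected attribute

classifier = nn.Linear(3, 1)                   # toy stand-in for the ANN
adv_head = nn.Linear(1, 1)                     # toy adversary on the logits
bce = nn.BCEWithLogitsLoss()

opt_cls = torch.optim.Adam(classifier.parameters(), lr=1e-3)
sched = torch.optim.lr_scheduler.LambdaLR(opt_cls, lr_lambda=staircase_decay)

logits = classifier(x)
loss_cls = bce(logits, y)                      # classifier objective
loss_adv = bce(adv_head(logits), z)            # adversary objective

# Gradients of both losses with respect to the classifier's parameters.
params = list(classifier.parameters())
grads_cls = torch.autograd.grad(loss_cls, params, retain_graph=True)
grads_adv = torch.autograd.grad(loss_adv, params)

# Write the projected gradients into .grad and take one optimizer step.
for p, g in zip(params, project_gradients(grads_cls, grads_adv)):
    p.grad = g
opt_cls.step()
sched.step()
```

In a full training loop the adversary would be updated with its own optimizer on `loss_adv` in the same step; that part is left out here to keep the projection itself in focus.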

Challenges I ran into

The main challenge was understanding both the gradient projection process defined in the white paper and the original TensorFlow implementation of it. Every other problem could be solved by referring to community questions that had already been answered online (I consider ptrblck an unknowing contributor to this project thanks to all his answers on the PyTorch discussion board). After several discussions with the peers who helped me throughout this project (Ryan Khurana and Adam Resnick), I figured out how to implement the same process in PyTorch. An AI researcher from IBM who worked on the original TensorFlow implementation has confirmed that the code is working correctly.
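For readers who have not seen it, the gradient projection step can be summarized in one line. The notation below is an assumption made for this summary: $W$ are the classifier's weights, $L_P$ is the classifier's prediction loss, $L_A$ is the adversary's loss, and $\alpha$ is a tunable weight. The form follows the adversarial debiasing paper by Zhang, Lemoine, and Mitchell (2018) as best I can state it.

```latex
% Direction used to update the classifier weights W
\Delta W \;=\; \nabla_W L_P
  \;-\; \operatorname{proj}_{\nabla_W L_A}\!\left(\nabla_W L_P\right)
  \;-\; \alpha \, \nabla_W L_A ,
\qquad
\operatorname{proj}_{v}(u) \;=\; \frac{u \cdot v}{\lVert v \rVert^{2}} \, v .
```

The middle term removes the part of the classifier's gradient that would help the adversary, and the last term actively pushes the classifier in the direction that increases the adversary's loss.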

Accomplishments that I'm proud of

I'm proud of having successfully refactored TensorFlow code into PyTorch code despite my limited experience with either library. I'm also proud that the AI Fairness 360 development team is interested in integrating my solution into their library, and I'm excited by the value that AI developers like me could get from my code.

What I learned

I learned that PyTorch is an incredibly powerful and flexible tool for AI and ML engineering and development in Python and that it can be effectively integrated with other Python libraries. I also learned a lot about the theories and conceptual frameworks related to the development of more ethical AI, as well as how to leverage PyTorch to make such efforts more seamless for users of libraries such as IBM's AI Fairness 360.

What's next for Torchbearer

I plan to keep improving the code to make it more flexible and, after thorough testing, to coordinate with the IBM researchers who maintain the AI Fairness 360 library to commit my implementation into their master repository. As I currently work as a data science consultant, I plan to use this project to build proof-of-concept projects for our clients and help them hone their best practices for developing more ethical AI. My former manager from my HSBC internship has already expressed interest in this. If I win, I hope to spend the 30 minutes I would get with the PyTorch team going over the code and discussing ways to make it more flexible and accessible for developers around the world.
