An overview of machine fairness frameworks

Leonardo Pinheiro
5 min read · Aug 10, 2020

Introduction

Unfair and discriminatory behavior is an unsettling topic in today's society. From police brutality in the U.S. to gender pay gaps and the lack of diversity on the leadership committees of major companies, the struggle against discrimination touches nearly every aspect of our lives. Because discriminatory behavior is so pervasive, it is no surprise that many datasets used to train machine learning algorithms contain some form of discriminatory bias. Datasets are generated by societal and organizational processes, and any discriminatory behavior in those processes naturally ends up encoded in the data.

In recent years, discriminatory behavior by AI models has made headlines many times. From a Microsoft chatbot's xenophobic tweets to Amazon's gender-biased hiring tool, AI discrimination seems to be becoming as prominent as the institutional bias we already deal with. Fortunately, the AI community has stepped up to address the issue, and both research on AI fairness and the development of fairness tools have exploded in recent years.

While there are many good resources for understanding how bias manifests in data and what theoretical research has been done to address it, practical frameworks for tackling unfairness are less well known. In this post we aim to fill that gap with a brief overview of the most popular machine fairness frameworks to date.

Fairness Frameworks for Machine Learning

1. Aequitas

The first tool on our list is Aequitas. It describes itself as an open-source bias audit toolkit for data scientists, machine learning researchers, and policymakers, and it can be used as a web app, from the command line, or as a Python library. With Aequitas you can generate a bias report that describes the presence of discrimination in a dataset based on a set of predefined metrics and reference groups.

Audit pipeline using Aequitas

Instead of providing algorithms to mitigate unfairness, Aequitas works as a risk-assessment tool that helps decision-makers make more informed use of models with respect to discrimination risk.
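As a rough sketch of how an audit with the Python library might look (exact function and column names may differ across Aequitas versions; the input file and reference groups below are illustrative assumptions):

```python
# A minimal Aequitas sketch: the library expects a DataFrame with 'score',
# 'label_value', and one or more group-attribute columns.
import pandas as pd
from aequitas.preprocessing import preprocess_input_df
from aequitas.group import Group
from aequitas.bias import Bias
from aequitas.fairness import Fairness

df = pd.read_csv("predictions.csv")        # hypothetical file of model outputs
df, _ = preprocess_input_df(df)

# group-level metrics (FPR, FNR, predicted positive rate, ...) per attribute value
crosstab, _ = Group().get_crosstabs(df)

# disparities relative to chosen reference groups (example choices)
bias_df = Bias().get_disparity_predefined_groups(
    crosstab, original_df=df,
    ref_groups_dict={"race": "Caucasian", "sex": "Male"},
)

# fairness determinations based on configurable disparity thresholds
fairness_df = Fairness().get_group_value_fairness(bias_df)
print(fairness_df.head())
```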

2. IBM AI Fairness 360

AI Fairness 360 is a project from the LF AI Foundation supported by IBM Research. It is designed to be an extensible open-source toolkit that can help users examine, report, and mitigate discrimination and bias in machine learning models throughout the AI application lifecycle.
It is a comprehensive toolkit that provides not only a set of fairness metrics for both datasets and models, but also algorithms to mitigate bias: in datasets through data transformations, and in models through algorithms such as the prejudice remover. It includes many state-of-the-art methods and comes with numerous tutorials and an online demo app.

IBM AI Fairness 360 demo web app. It can be tested on the COMPAS dataset, the German credit scoring dataset, and the Adult census income dataset.
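As a brief sketch of the Python API (this assumes AIF360 is installed and the Adult census data files have been downloaded as described in its documentation; the group encodings are assumptions for illustration), measuring and then mitigating dataset bias might look like this:

```python
# A minimal AIF360 sketch: measure dataset bias on the Adult census data and
# mitigate it with the Reweighing pre-processing algorithm.
from aif360.datasets import AdultDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

privileged = [{"sex": 1}]      # assumed encoding: 1 = male
unprivileged = [{"sex": 0}]    # assumed encoding: 0 = female

dataset = AdultDataset()
metric = BinaryLabelDatasetMetric(
    dataset, privileged_groups=privileged, unprivileged_groups=unprivileged
)
print("Disparate impact before:", metric.disparate_impact())

# Reweighing assigns instance weights so that the favorable outcome becomes
# independent of the protected attribute in the training data.
rw = Reweighing(privileged_groups=privileged, unprivileged_groups=unprivileged)
dataset_transf = rw.fit_transform(dataset)
metric_transf = BinaryLabelDatasetMetric(
    dataset_transf, privileged_groups=privileged, unprivileged_groups=unprivileged
)
print("Disparate impact after:", metric_transf.disparate_impact())
```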

3. Fairlearn

Developed with support from Microsoft, Fairlearn is an open-source toolkit that also aims to help data scientists assess and improve the fairness of their AI systems. It consists of an interactive visualization dashboard (based on Jupyter notebooks) for exploring fairness and performance metrics, and a set of algorithms for unfairness mitigation. The algorithms available through Fairlearn support different notions of fairness, such as demographic parity, equalized odds, and positive/negative rate parity.

Fairlearn visualization dashboard.
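A minimal sketch of the mitigation and metrics workflow, assuming a recent Fairlearn version and hypothetical train/test splits (`X_train`, `y_train`, `sex_train`, and their test counterparts):

```python
# A minimal Fairlearn sketch: train under a demographic-parity constraint and
# break performance metrics down by group.
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame, selection_rate
from fairlearn.reductions import ExponentiatedGradient, DemographicParity

# fit a classifier subject to a demographic-parity constraint
mitigator = ExponentiatedGradient(
    LogisticRegression(max_iter=1000), constraints=DemographicParity()
)
mitigator.fit(X_train, y_train, sensitive_features=sex_train)
y_pred = mitigator.predict(X_test)

# disaggregate accuracy and selection rate by sensitive feature
mf = MetricFrame(
    metrics={"accuracy": accuracy_score, "selection_rate": selection_rate},
    y_true=y_test, y_pred=y_pred, sensitive_features=sex_test,
)
print(mf.by_group)
```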

4. Themis

Themis is a testing-based approach for measuring discrimination in a software system. It can measure both group discrimination and causal discrimination. Unlike the other assessment tools on this list, Themis answers causal questions through testing, such as "does changing a person's race affect whether the system recommends giving that person a loan?". The authors of the methodology claim that this approach measures discrimination more accurately than prior work, which generally focused on identifying differences in model output distributions, correlations, or mutual information between inputs and outputs.
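Themis generates its test suites automatically, but the core causal-discrimination idea can be sketched in a few lines (this is not Themis's actual API; the `model` object and attribute handling are hypothetical):

```python
# Toy illustration of causal discrimination testing: how often does the
# prediction flip when only a protected attribute is changed?
import pandas as pd

def causal_discrimination_rate(model, X: pd.DataFrame, attribute: str, values) -> float:
    """Fraction of rows whose prediction changes when only `attribute` is altered."""
    base = model.predict(X)
    flipped = pd.Series(False, index=X.index)
    for value in values:
        X_alt = X.copy()
        X_alt[attribute] = value
        flipped |= pd.Series(model.predict(X_alt) != base, index=X.index)
    return float(flipped.mean())

# e.g. causal_discrimination_rate(loan_model, applicants, "race", ["white", "black"])
```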

5. Audit-AI

This tool, developed by pymetrics, is an open-source bias-testing library for generalized machine learning applications. Audit-AI determines whether groups differ based on statistical significance or practical significance (whether a difference is large enough to matter in practice), with support for a range of statistical tests. It also offers tools to check for differences over time or across regions using the Cochran–Mantel–Haenszel test, a test commonly used in regulatory circles.

Statistical bias testing using Audit-AI
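The kind of check Audit-AI automates is easy to illustrate with plain SciPy (this is not Audit-AI's API, just a sketch of comparing selection rates between two groups for both statistical and practical significance):

```python
# Sketch of a group selection-rate comparison: practical significance via the
# 4/5ths (80%) rule and statistical significance via a chi-squared test.
import numpy as np
from scipy import stats

def selection_rate_check(passed_a, total_a, passed_b, total_b, alpha=0.05):
    rate_a, rate_b = passed_a / total_a, passed_b / total_b
    ratio = min(rate_a, rate_b) / max(rate_a, rate_b)   # practical significance
    table = np.array([[passed_a, total_a - passed_a],   # 2x2 contingency table
                      [passed_b, total_b - passed_b]])
    _, p_value, _, _ = stats.chi2_contingency(table)    # statistical significance
    return {
        "selection_rate_ratio": ratio,
        "passes_4_5_rule": ratio >= 0.8,
        "p_value": p_value,
        "statistically_different": p_value < alpha,
    }

print(selection_rate_check(passed_a=48, total_a=100, passed_b=30, total_b=100))
```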

6. InterpretML

While most of the previous tools focus on pre-processing and in-processing algorithms for dealing with bias in datasets and models, post-processing techniques that analyze and adjust model outputs are often just as helpful against AI unfairness. One valuable tool of this kind is model interpretability, the focus of the package we introduce next.

InterpretML is a library for training interpretable glassbox models and explaining blackbox systems. It helps you understand both a model's global behavior and the reasons behind individual predictions. The library includes Explainable Boosting Machines (EBMs), an interpretable model developed by Microsoft Research that uses modern boosting techniques such as gradient boosting yet produces lossless explanations that can be edited by domain experts. Other techniques supported by the package include LIME and SHAP.

Performance comparison between EBM and other models. EBM is competitive with state-of-the-art algorithms such as XGBoost while providing global model interpretability.
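Training and explaining an EBM takes only a few lines. A brief sketch, assuming the `interpret` package is installed and `X_train`, `y_train`, `X_test`, `y_test` are hypothetical data splits:

```python
# A minimal InterpretML sketch: train an Explainable Boosting Machine and
# inspect global and local explanations.
from interpret.glassbox import ExplainableBoostingClassifier
from interpret import show

ebm = ExplainableBoostingClassifier()
ebm.fit(X_train, y_train)

# global explanation: learned per-feature shape functions and importances
show(ebm.explain_global())

# local explanation: each feature's contribution to individual predictions
show(ebm.explain_local(X_test[:5], y_test[:5]))
```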

Conclusion

Machine fairness is a very active area of research and development right now. With unfairness scandals affecting most of the big tech companies, investment in the field is likely to remain strong and new techniques will keep improving. If you're dealing with sensitive data and unfair treatment is a possible concern in your application, consider giving one of these tools a shot and help create a fairer world.
