# Data Decisions and Theoretical Implications when Adversarially Learning Fair Representations

```bibtex
@article{Beutel2017DataDA,
  title   = {Data Decisions and Theoretical Implications when Adversarially Learning Fair Representations},
  author  = {Alex Beutel and Jilin Chen and Zhe Zhao and Ed H. Chi},
  journal = {ArXiv},
  year    = {2017},
  volume  = {abs/1707.00075}
}
```

How can we learn a classifier that is "fair" for a protected or sensitive group when we do not know if the input to the classifier belongs to the protected group? [...] **Key method:** Here, we use an adversarial training procedure to remove information about the sensitive attribute from the latent representation learned by a neural network. In particular, we study how the choice of data for the adversarial training affects the resulting fairness properties. We find two interesting results: a small amount of data is…
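The adversarial setup described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: an encoder maps inputs to a latent representation z, a predictor learns the task label from z, and an adversary tries to recover the sensitive attribute from z; the encoder descends the task loss while ascending the adversary's loss. All names and hyperparameters here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Synthetic data: task label y depends on feature 0,
# sensitive attribute a depends on feature 1.
n = 200
x = rng.normal(size=(n, 4))
y = (x[:, 0] + 0.1 * rng.normal(size=n) > 0).astype(float)
a = (x[:, 1] + 0.1 * rng.normal(size=n) > 0).astype(float)

# Linear encoder, logistic predictor, logistic adversary.
W_enc = rng.normal(scale=0.1, size=(4, 2))
w_pred = rng.normal(scale=0.1, size=2)
w_adv = rng.normal(scale=0.1, size=2)

lr, lam = 0.1, 1.0  # lam weights the reversed adversary gradient

for _ in range(500):
    z = x @ W_enc              # latent representation
    p_y = sigmoid(z @ w_pred)  # task prediction
    p_a = sigmoid(z @ w_adv)   # adversary's guess of a

    # Cross-entropy gradients with respect to z.
    g_pred_z = np.outer(p_y - y, w_pred) / n
    g_adv_z = np.outer(p_a - a, w_adv) / n

    # Adversary and predictor each descend their own loss.
    w_adv -= lr * z.T @ (p_a - a) / n
    w_pred -= lr * z.T @ (p_y - y) / n
    # Encoder descends the task loss but ASCENDS the adversary
    # loss, pushing sensitive information out of z.
    W_enc -= lr * x.T @ (g_pred_z - lam * g_adv_z)

task_acc = np.mean((sigmoid((x @ W_enc) @ w_pred) > 0.5) == y)
adv_acc = np.mean((sigmoid((x @ W_enc) @ w_adv) > 0.5) == a)
```

After training, `task_acc` should stay high while `adv_acc` drifts toward chance, since the encoder can simply suppress the direction of feature 1.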

#### 220 Citations

Costs and Benefits of Fair Representation Learning

- Computer Science
- AIES
- 2019

It is shown that using fair representation learning as an intermediate step in fair classification incurs a cost compared to directly solving the problem, referred to as the cost of mistrust. The benefits of fair representation learning are also quantified, by showing that any subsequent use of the cleaned data will not be too unfair.

Mitigating Unwanted Biases with Adversarial Learning

- Mathematics, Computer Science
- AIES
- 2018

This work presents a framework for mitigating biases concerning demographic groups by including a variable Z for the group of interest and simultaneously learning a predictor and an adversary; this results in accurate predictions that exhibit less evidence of stereotyping Z.

Transfer of Machine Learning Fairness across Domains

- Computer Science, Mathematics
- ArXiv
- 2019

This work offers new theoretical guarantees for improving fairness across domains, proposes a modeling approach for transfer to data-sparse target domains, and gives empirical results validating the theory and showing that these modeling approaches can improve fairness metrics with less data.

FLEA: Provably Fair Multisource Learning from Unreliable Training Data

- Computer Science, Mathematics
- ArXiv
- 2021

FLEA is introduced, a filtering-based algorithm that allows the learning system to identify and suppress data sources that would have a negative impact on fairness or accuracy if they were used for training; it is proved formally that, given enough data, FLEA protects the learner against unreliable data.

Learning Fair Representations via an Adversarial Framework

- Computer Science, Mathematics
- ArXiv
- 2019

A minimax adversarial framework with a generator to capture the data distribution and generate latent representations, and a critic to ensure that the distributions across different protected groups are similar, provides a theoretical guarantee with respect to statistical parity and individual fairness.

Imparting Fairness to Pre-Trained Biased Representations

- Computer Science
- 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
- 2020

This paper first studies the "linear" form of the adversarial representation learning problem, obtaining an exact closed-form expression for its global optima through spectral learning, and then extends this solution and analysis to non-linear functions through kernel representation.

Impossibility results for fair representations

- Computer Science, Mathematics
- ArXiv
- 2021

It is proved that no representation can guarantee the fairness of classifiers for different tasks trained using it; even the basic goal of achieving label-independent Demographic Parity fairness fails once the marginal data distribution shifts.

Discovering Fair Representations in the Data Domain

- Computer Science
- 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019

This work proposes to cast the problem of interpretability and fairness in computer vision and machine learning applications as data-to-data translation, i.e. learning a mapping from an input domain to a fair target domain, where a fairness definition is being enforced.

Inherent Tradeoffs in Learning Fair Representation

- Computer Science
- ArXiv
- 2019

This paper provides the first result that quantitatively characterizes the tradeoff between demographic parity and the joint utility across different population groups, and proves that if the optimal decision functions across different groups are close, then learning fair representations leads to an alternative notion of fairness known as accuracy parity.

Learning to Ignore: Fair and Task Independent Representations

- Computer Science
- ArXiv
- 2021

Learning fair models that ignore sensitive attributes can be seen as part of a common framework of learning invariant representations, which can be used for domain adaptation, transferring knowledge, and learning effectively from very few examples.

#### References

Showing 1–10 of 12 references.

Censoring Representations with an Adversary

- Computer Science, Mathematics
- ICLR
- 2016

This work formulates the adversarial model as a minimax problem and optimizes that minimax objective using a stochastic gradient alternating min-max optimizer; it demonstrates the ability to provide discriminant-free representations for standard test problems and compares with previous state-of-the-art methods for fairness.

Learning Fair Representations

- Mathematics, Computer Science
- ICML
- 2013

We propose a learning algorithm for fair classification that achieves both group fairness (the proportion of members in a protected group receiving positive classification is identical to the…

Equality of Opportunity in Supervised Learning

- Computer Science, Mathematics
- NIPS
- 2016

This work proposes a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features, and shows how to optimally adjust any learned predictor so as to remove discrimination according to this definition.

Domain-Adversarial Training of Neural Networks

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2016

A new representation learning approach for domain adaptation, in which data at training and test time come from similar but different distributions; the approach can be implemented in almost any feed-forward model by augmenting it with a few standard layers and a new gradient reversal layer.
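The gradient reversal layer mentioned in this reference is simple to state: the forward pass is the identity, and the backward pass multiplies incoming gradients by -λ, so the feature extractor is updated to *increase* the domain (or sensitive-attribute) classifier's loss. A minimal sketch with hypothetical helper names, not the paper's code:

```python
import numpy as np

def grl_forward(z):
    # Identity in the forward direction: features pass through unchanged.
    return z

def grl_backward(grad_output, lam=1.0):
    # Reverse (and scale) gradients flowing back into the feature
    # extractor, implementing the adversarial objective.
    return -lam * grad_output

z = np.array([1.0, -2.0, 3.0])
assert np.allclose(grl_forward(z), z)
assert np.allclose(grl_backward(np.ones(3), lam=0.5), -0.5 * np.ones(3))
```

In an autodiff framework this is typically registered as a custom op whose gradient is the negated, scaled identity.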

Beyond Globally Optimal: Focused Learning for Improved Recommendations

- Computer Science
- WWW
- 2017

This work offers a new technique called focused learning, based on hyperparameter optimization and a customized matrix factorization objective, which demonstrates prediction accuracy improvements on multiple datasets.

The Variational Fair Autoencoder

- Mathematics, Computer Science
- ICLR
- 2016

This model is based on a variational autoencoding architecture with priors that encourage independence between sensitive and latent factors of variation; it is more effective than previous work at removing unwanted sources of variation while maintaining informative latent representations.

Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings

- Computer Science, Mathematics
- NIPS
- 2016

This work empirically demonstrates that its algorithms significantly reduce gender bias in embeddings while preserving their useful properties, such as the ability to cluster related concepts and to solve analogy tasks.

Domain Separation Networks

- Computer Science
- NIPS
- 2016

The novel architecture results in a model that outperforms the state-of-the-art on a range of unsupervised domain adaptation scenarios and additionally produces visualizations of the private and shared representations, enabling interpretation of the domain adaptation process.

Adaptive Subgradient Methods for Online Learning and Stochastic Optimization

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2011

This work describes and analyzes an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees that are provably as good as the best proximal function that can be chosen in hindsight.
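The adaptive rule summarized above (AdaGrad) divides a global learning rate, per coordinate, by the square root of the accumulated squared gradients. A minimal sketch of the diagonal variant, with hypothetical names and a toy quadratic objective:

```python
import numpy as np

def adagrad_step(w, grad, accum, lr=0.1, eps=1e-8):
    """One diagonal-AdaGrad update; returns new weights and accumulator."""
    accum = accum + grad ** 2               # per-coordinate squared-gradient history
    w = w - lr * grad / (np.sqrt(accum) + eps)  # coordinates with large history move less
    return w, accum

# Minimize f(w) = ||w||^2 / 2, whose gradient is simply w.
w = np.array([5.0, -3.0])
accum = np.zeros_like(w)
for _ in range(2000):
    w, accum = adagrad_step(w, w, accum, lr=0.5)
```

Frequently-updated coordinates get shrinking effective step sizes, while rarely-updated ones keep larger steps, which is why the method suits sparse gradients.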

Inherent Trade-Offs in the Fair Determination of Risk Scores

- Computer Science, Mathematics
- ITCS
- 2017

Some of the ways in which key notions of fairness are incompatible with each other are identified, providing a framework for thinking about the trade-offs between them.