Measuring and Mitigating Biases in Vision and Language Models

Wang, Tianlu, Computer Science - School of Engineering and Applied Science, University of Virginia
Ordonez-Roman, Vicente, EN-Comp Science Dept, University of Virginia

Deep learning models have achieved unprecedented success in many areas of computer vision and natural language processing. Despite this broad success, they have also been criticized for carrying unwanted biases. For example, a human activity recognition model can over-associate “man” with “coaching” and “woman” with “shopping”, and a resume filtering system can recommend more male candidates for the position of “programmer” and more female candidates for the position of “homemaker”. This dissertation studies the causes of such phenomena. Most deep learning models require large amounts of training data. These datasets are collected from the Internet, and some are further annotated by people. As a result, datasets may reflect biases that exist in society or stereotypes held by annotators. Models can inherit biases from datasets and may even amplify them. As more research models are adopted in practical applications, serious concerns have been raised about the potential adverse effects of these biases on societal fairness and equality. It is critical to understand how biases arise in datasets and models, and how to mitigate them.

The goal of this research is to establish metrics that measure biases in different models and to propose methods that mitigate them. We approach the problem through three main projects. In the first project, we present a framework to measure and mitigate intrinsic biases with respect to protected variables, such as gender, in visual recognition tasks. We show that even when datasets are balanced so that each label co-occurs equally with each gender, learned models amplify the association between labels and gender as much as if the data had not been balanced. To mitigate bias amplification, we adopt an adversarial approach that removes unwanted features corresponding to protected variables from intermediate representations in a deep neural network. In the second project, we discover that semantic-agnostic corpus regularities captured by word embeddings, such as word frequency, negatively impact the performance of existing debiasing algorithms. We propose a simple but effective technique that purifies the word embeddings against such corpus regularities before inferring and removing the gender subspace in biased word embeddings. Our approach preserves the distributional semantics of the pre-trained word embeddings while reducing gender bias to a significantly larger degree than prior approaches. In the third project, we focus on more general biases. We propose a clustering-based metric that measures bias without sensitive attribute annotations. We demonstrate that our metric provides estimates consistent with measurements from existing metrics that rely on sensitive attribute annotations. We find this type of metric especially useful for active learning, where the iterative selection of examples may introduce significant biases. This dissertation also includes our work exploring data dependencies and the robustness issues they cause.
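The "inferring and removing the gender subspace" step mentioned for the second project can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes a one-dimensional gender subspace estimated from a single word pair, uses invented three-dimensional toy vectors, and omits the purification against corpus regularities such as word frequency. The helper names (`dot`, `normalize`, `remove_direction`) are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale v to unit length."""
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [a / norm for a in v]


def remove_direction(v, direction):
    """Return v minus its projection onto `direction` (assumed unit length)."""
    scale = dot(v, direction)
    return [a - scale * d for a, d in zip(v, direction)]


# Toy embeddings, invented for illustration only.
toy = {
    "he": [1.0, 0.2, 0.0],
    "she": [-1.0, 0.2, 0.0],
    "programmer": [0.4, 0.5, 0.7],
}

# Assumption: the gender direction is the normalized difference of one word pair.
gender_direction = normalize([a - b for a, b in zip(toy["he"], toy["she"])])

# Project the gender component out of a gender-neutral word.
debiased = remove_direction(toy["programmer"], gender_direction)

# After removal, the word has no component left along the gender direction.
residual = dot(debiased, gender_direction)
```

In this toy setting only the component along the estimated direction changes; the remaining coordinates of the word vector are left as they were, which mirrors the stated goal of preserving the embedding's other semantics.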

The techniques proposed in this dissertation represent a first step towards measuring and mitigating bias in some vision and language tasks. With the advent of Artificial Intelligence (AI) and the proliferation of more complicated datasets, continued effort is needed to help people use technology fairly. We hope the techniques presented in this dissertation improve AI models so that more people can benefit from them.

PHD (Doctor of Philosophy)
Fairness, Computer Vision, Natural Language Processing
Issued Date: