Mitigating Spurious Bias for Learning Robust Machine Learning Models

Author:
Zheng, Guangtao, Computer Science - School of Engineering and Applied Science, University of Virginia (ORCID: orcid.org/0000-0002-1287-4931)
Advisors:
Zhang, Aidong, EN-Comp Science Dept, University of Virginia
Abstract:

Modern machine learning (ML) models have shown strong empirical performance across a wide range of domains. However, they tend to rely on spurious correlations between target labels and non-essential attributes for predictions, leading to correct predictions for the wrong reasons. For example, an image classifier may identify objects based on frequently co-occurring backgrounds rather than the defining features of the objects themselves. This phenomenon, known as spurious bias, can significantly degrade model performance under distribution shifts, where the learned spurious correlations no longer hold, limiting the model's reliability and generalizability in real-world scenarios. This dissertation focuses on mitigating spurious bias so that ML models can generalize reliably and robustly in new environments with unknown or known distribution shifts. We propose novel methods tailored for out-of-distribution generalization and for generalization under subpopulation shifts, addressing unknown and known distribution shifts, respectively. For out-of-distribution generalization, where target data distributions are unknown during training, we propose synthesizing spurious attributes, such as novel image styles, to explore new data distributions. We design learning algorithms that integrate this data exploration into the learning of robust and generalizable features, and demonstrate their effectiveness in challenging settings such as few-shot learning and single domain generalization. Under subpopulation shifts, where the proportions of certain data groups are known to vary between training and testing but group annotations are generally not accessible, models may inadvertently rely on spurious attributes in certain data groups for predictions. To address this, we propose multimodal-assisted methods that detect and mitigate spurious bias using pre-trained vision-language models. We further propose fully self-guided methods that leverage a model's internal states for automatic spurious bias detection and mitigation. By directly addressing spurious bias, this dissertation advances the development of robust and trustworthy ML models that make correct predictions for the right reasons, improving their reliability across diverse environments.

Degree:
PHD (Doctor of Philosophy)
Keywords:
spurious correlations, robustness, distribution shifts
Language:
English
Issued Date:
2025/06/17