Fostering Trustworthiness in Machine Learning Algorithms

Huai, Mengdi, Computer Science - School of Engineering and Applied Science, University of Virginia
Zhang, Aidong, EN-Comp Science Dept, University of Virginia

Recent years have witnessed an explosion of work developing and applying machine learning algorithms to build intelligent learning systems (e.g., medical decision systems and self-driving cars). However, traditional machine learning algorithms focus mainly on optimizing accuracy and efficiency, and fail to consider trustworthiness in their design. Trustworthiness reflects the degree of a user's confidence that a deployed machine learning algorithm will operate as the user expects in the face of various circumstances, such as human errors, system failures, and malicious attacks. The essential characteristics at the core of trustworthiness include model transparency, robustness against malicious attacks, and privacy preservation. Without fully studying the trustworthiness of deployed machine learning algorithms, we face a variety of devastating social and environmental consequences. In this dissertation, we take steps to study and address trustworthiness issues in the design of machine learning algorithms. Specifically, we first propose several model interpretation methods that give insight into machine learning models' working mechanisms by interpreting what the models have learned, and hence help increase trust in model decisions. We then design both offensive and defensive strategies to investigate the security vulnerabilities of machine learning algorithms to malicious attacks. In addition, we design several effective privacy-preserving mechanisms for sharing data and machine learning models without leaking sensitive information. Extensive experiments are conducted and presented to demonstrate the effectiveness of the proposed methods.

PHD (Doctor of Philosophy)
All rights reserved (no additional license for public reuse)