Online Archive of University of Virginia Scholarship
Scalable and Generalizable Optimization Methods for Deep and Federated Learning
Author
Sun, Jianhui, Computer Science - School of Engineering and Applied Science, University of Virginia, ORCID 0000-0003-0032-3646
Advisors
Zhang, Aidong, EN-Comp Science Dept, University of Virginia
Abstract
Accelerating convergence and closing the generalization gap are two fundamental goals in deep learning optimization, particularly as models grow larger and training becomes increasingly distributed. This dissertation advances both theoretical understanding and practical algorithm design along two main axes: improving generalization in stochastic optimization and developing efficient federated learning (FL) algorithms under realistic constraints. In the first theme, I develop a general PAC-Bayesian framework for analyzing the generalization behavior of a broad class of stochastic algorithms. This tool enables hyperparameter-dependent generalization bounds, facilitates automated hyperparameter tuning, and informs the design of new algorithms with provably improved generalizability. The framework applies across a range of learning paradigms, including distributed training, adversarial training, and transfer learning. In the second theme, I introduce several novel algorithms that provably mitigate both statistical and system heterogeneity. I also propose FL algorithms for solving complex nested or multi-level objectives, many of which obtain the first or best-known convergence guarantees. These algorithms advance the theoretical foundation of FL and offer practical benefits for real-world decentralized systems.
Sun, Jianhui. Scalable and Generalizable Optimization Methods for Deep and Federated Learning. University of Virginia, Computer Science - School of Engineering and Applied Science, PHD (Doctor of Philosophy), 2025-07-27, https://doi.org/10.18130/rxep-j131.