Online Archive of University of Virginia Scholarship
Distributed Data Learning with Knowledge Transmission Topology
Author
Li, Yifan, Statistics - Graduate School of Arts and Sciences, University of Virginia
Advisors
Tang, Xiwei, AS-Statistics (STAT), University of Virginia
Abstract
In today's digital era, vast amounts of data, such as hospital health records and individual device usage data, are stored in diverse locations. Keeping these datasets distributed helps preserve individual privacy and manage data sizes, but the resulting constraints on data sharing and aggregation make comprehensive data analysis challenging. In this thesis, we investigate statistical modeling in a distributed data system under several information transmission structures. In Chapter 2, we study a penalization-based model integration problem with a network constraint. We propose a network sparsification method that significantly reduces communication across data sites, improving computational efficiency while preserving estimation efficiency. In Chapter 3, we develop a Decentralized Federated Learning framework that requires no sharing or aggregation of data. We explore different knowledge-sharing mechanisms between sites, with the goal of building predictive models for each individual site without a central server. At the same time, we examine how different transmission topologies affect the efficiency of communication.
Li, Yifan. Distributed Data Learning with Knowledge Transmission Topology. University of Virginia, Statistics - Graduate School of Arts and Sciences, PHD (Doctor of Philosophy), 2024-04-30, https://doi.org/10.18130/qg4q-0k67.