Sequential Decision-Making in Intelligent Multi-Agent Systems

Author:
Shi, Chengshuai, Electrical Engineering - School of Engineering and Applied Science, University of Virginia
Shen, Cong, EN-Elec & Comp Engr Dept, University of Virginia

Sequential decision-making models, especially multi-armed bandits (MAB) and reinforcement learning (RL), have found tremendous success in a wide range of applications, including cognitive radio, recommender systems, healthcare, and beyond. However, the majority of previous studies focus on single-agent scenarios, which may fail to capture many modern real-world multi-agent applications (e.g., multiple devices sharing communication resources in cognitive radio). This thesis is thus motivated to extend previous single-agent decision-making studies to multi-agent settings, which raises new challenges in system modeling, communication strategies, and beyond. In particular, this thesis focuses on two core topics in designing sequential decision-making algorithms for intelligent multi-agent systems: how to communicate and how to collaborate.

First, communication is a component unique to multi-agent systems, with no counterpart in the single-agent setting. This thesis investigates this direction by providing efficient and robust information-sharing mechanisms. In particular, focusing on a decentralized multi-player MAB system, novel communication tools are developed, such as adaptive quantization and error-correction coding. Second, beyond communication, the collaboration strategy is also key to enabling effective multi-agent systems. In this part, this thesis presents a line of work on federated MAB that extends the core principles of federated learning to MAB and, in particular, summarizes a modularized design principle for federated contextual bandits.

With these advances, this thesis deepens the understanding of decision-making designs in multi-agent systems and provides fundamental insights for future developments.

PHD (Doctor of Philosophy)
Multi-armed Bandits, Reinforcement Learning, Multi-agent System
Issued Date: