Explanations for Multi-Agent Reinforcement Learning

Boggess, Kayla, Computer Science - School of Engineering and Applied Science, University of Virginia
Feng, Lu, EN-Comp Science Dept, University of Virginia
Multi-agent reinforcement learning (MARL) systems have seen significant growth in recent years across a range of exciting applications. Yet, when functioning as black boxes, these systems can cause user misunderstanding and misuse, since users do not always know when or why agents perform certain actions. Generating explanations of agent decisions is crucial: it improves system transparency, increases user satisfaction, and facilitates human-agent collaboration. However, current work on explainable reinforcement learning (xRL) focuses mostly on the single-agent setting. There is thus a need for policy summarization methods that explain agents' global behaviors under a given MARL policy, as well as language explanations that answer user queries about agents' local decisions.
This dissertation focuses on generating summarizations and explanations for MARL. It explores two main questions. First, how do we generate summarizations and explanations for centralized MARL, where all agents' actions are treated as a single joint action? We will create global policy summaries and query-based explanations that address questions such as when, why, and what actions are taken in specific states or conditions. Additionally, temporal explanations will clarify the feasibility of plans over time. Second, how do we generate summarizations and explanations for decentralized MARL, where each agent follows its own individual policy? Here, too, we will provide both global policy summaries and query-based explanations. In both settings, we will assess the effectiveness of all methods through computational evaluations and user studies.
Chapter 2 reviews existing work on explainable MARL. Chapters 3 and 4 present the main contributions for centralized MARL: Chapter 3 describes the developed summarization and explanation methods, while Chapter 4 presents work on generating contrastive temporal explanations. The contributions for decentralized MARL appear in Chapters 5 and 6, which present the methods developed for summarizations and explanations, respectively. Finally, Chapter 7 summarizes these contributions and their broader impacts.
PHD (Doctor of Philosophy)
Explainable AI, Multi-Agent Reinforcement Learning, Artificial Intelligence, Explainability
English
2025/04/10