Coordination Issues in Cooperative Decentralized Decision Problems
Zhao, Yijia, Systems Engineering - School of Engineering and Applied Science, University of Virginia
Beling, Peter, Department of Systems and Information Engineering, University of Virginia
Patek, Stephen, Department of Systems and Information Engineering, University of Virginia
Decentralized control of complex engineering systems has the potential to provide the kind of security, reliability and scalability a centralized control scheme may be unable to offer in a highly dynamic setting. Much of the previous work in the area of decentralized control of a team of cooperative agents focuses on devising centrally computed policies that can be implemented in a distributed fashion to optimize team performance. In this thesis, we study decentralized decision problems each agent must formulate and solve in order to compute its own policy independently.
A significant challenge in enabling the team of cooperative agents to work together efficiently is resolving coordination dilemmas associated with the presence of multiple optimal courses of actions. We aim to resolve these coordination dilemmas without assuming the presence of a centralized decision maker, the ability to negotiate or share intentions among the team members, or a consistent internal representation of the multiagent system.
Markov decision processes and its generalizations serve as the foundation in the study of single agent control. Decentralized control of cooperative agents is often framed as a multiagent extension of MDP, or as an identical interest stochastic game. Finding a solution for both involves solving stage games that are identical interest strategic games. Coordination dilemmas arise when multiple pareto-optimal Nash equilibria exist; solving these stage games thus reduces to an equilibrium selection problem. We propose a new solution concept as an equilibrium selection rule for a class of symmetric identical interest games where players are rewarded for commonality in their actions. The solution concept is endogenously salient and operates under the principle that no arbitrary decisions are allowed. We develop a linear time heuristic that 1) is theoretically guaranteed to compute the solution concept under certain conditions; 2) is shown to be successful with overwhelming likelihood in practice.
Next, we consider a decentralized path planning problem for team Bayesian search. A team of agents is tasked with making observations in a search area where an unknown number of targets exist. Each agent must formulate and solve a decentralized planning problem to compute its future actions. This planning problem is formulated as a partially observed Markov decision problem whose objective function is evaluated based on the assumption that all agents will use the same mixed strategy policy. We propose three dynamic programming heuristics for this planning problem-each can be used by agents in a decentralized fashion to compute an individual policy. The heuristics are designed such that all will arrive at the same policy as long as they use the same heuristics. The resulting policies are evaluated empirically in two instances of the team Bayesian search problem where resolving coordination dilemmas stemming from multiple optimal courses of actions is critical.
PHD (Doctor of Philosophy)
All rights reserved (no additional license for public reuse)