Learning to Cooperate with Unknown Agents in the Hanabi Challenge

Chen, Yizhen, Computer Science - School of Engineering and Applied Science, University of Virginia
Xu, Haifeng, EN-Comp Science Dept, University of Virginia

Recent work on multi-agent reinforcement learning has increasingly turned to Hanabi, a cooperative card game. Unlike previously popular testbeds such as Go, Poker, and Atari, Hanabi emphasizes cooperation between players. In zero-sum games like Poker and Go, it is more straightforward for an agent to learn and understand its opponents. In a cooperative game like Hanabi, players need to exchange information frequently, and the information they exchange is imperfect. Over the last two years, Hanabi has become a new benchmark. Cooperative games like Hanabi can teach AI to communicate more efficiently with other players, including human players, and such an environment can further enable AI to learn to communicate and cooperate with humans in the real world. Although a growing number of researchers are focusing on the Hanabi challenge, most previous work still studies agents in the self-play setting, with little attention to the ad-hoc team setting. We propose a new multi-agent RL method and new metrics to measure the performance of each agent in the ad-hoc team setting. This thesis focuses on approaches by which agents can improve their performance in ad-hoc team settings and comprehensively evaluates these approaches.

MS (Master of Science)
Reinforcement Learning, Multi-Agent Learning, Cooperation, Meta-Learning
All rights reserved (no additional license for public reuse)
Issued Date: