Computer Engineering and Applications, 2021, Vol. 57, Issue (21): 1-13. DOI: 10.3778/j.issn.1002-8331.2104-0432

Overview on Reinforcement Learning of Multi-agent Game

WANG Jun, CAO Lei, CHEN Xiliang, LAI Jun, ZHANG Legui   

  1. College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, China
  • Online: 2021-11-01 Published: 2021-11-04

多智能体博弈强化学习研究综述

王军,曹雷,陈希亮,赖俊,章乐贵   

  1. 陆军工程大学 指挥控制工程学院,南京 210007

Abstract:

Deep reinforcement learning has achieved breakthrough progress on single-agent tasks. Because of the complexity of multi-agent systems, however, ordinary algorithms cannot resolve their main difficulties. Moreover, as the number of agents grows, taking the maximization of a single agent's expected cumulative return as the learning objective often fails to converge, and some special convergence points do not correspond to rational strategies. For practical problems that have no optimal solution, reinforcement learning algorithms are at a loss. Introducing game theory into reinforcement learning handles the interactions among agents well and explains the rationality of the strategies corresponding to convergence points. More importantly, it allows an equilibrium solution to stand in for the optimal solution, yielding a relatively effective strategy. This article therefore surveys the reinforcement learning algorithms that have emerged in recent years from a game-theoretic perspective, summarizes the key points and difficulties of current game-theoretic reinforcement learning algorithms, and proposes several directions that may overcome these difficulties.

Key words: multi-agent, reinforcement learning, game theory
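
The abstract's idea of using an equilibrium solution in place of an optimal one can be illustrated with a small two-player matrix game. The following is a minimal sketch, not a method from the article: the function name `pure_nash_equilibria` and the prisoner's-dilemma payoff values are assumptions chosen for illustration. It enumerates the joint actions from which neither agent can improve its own payoff by deviating alone (pure-strategy Nash equilibria).

```python
from itertools import product


def pure_nash_equilibria(payoff_a, payoff_b):
    """Return all (row, col) joint actions where neither player gains by deviating alone.

    payoff_a[r][c] is the row player's payoff and payoff_b[r][c] the column
    player's payoff when the row player picks r and the column player picks c.
    """
    n_rows, n_cols = len(payoff_a), len(payoff_a[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        # Best payoff the row player can reach while the column action stays at c.
        best_a = max(payoff_a[i][c] for i in range(n_rows))
        # Best payoff the column player can reach while the row action stays at r.
        best_b = max(payoff_b[r][j] for j in range(n_cols))
        if payoff_a[r][c] == best_a and payoff_b[r][c] == best_b:
            equilibria.append((r, c))
    return equilibria


# Hypothetical prisoner's-dilemma payoffs (illustrative values only).
# Action 0 = cooperate, action 1 = defect.
PAYOFF_A = [[3, 0], [5, 1]]
PAYOFF_B = [[3, 5], [0, 1]]

if __name__ == "__main__":
    print(pure_nash_equilibria(PAYOFF_A, PAYOFF_B))
```

In this assumed game, no joint action gives both agents their individually highest payoff (5), so there is no jointly optimal solution to learn. The equilibrium check instead singles out mutual defection as the point where each agent's strategy is a rational response to the other's. Games such as matching pennies have no pure-strategy equilibrium at all, in which case this enumeration returns an empty list and mixed strategies would be needed.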

摘要:

使用深度强化学习解决单智能体任务已经取得了突破性的进展。由于多智能体系统的复杂性,普通算法无法解决其主要难点。同时,由于智能体数量增加,将最大化单个智能体的累积回报的期望值作为学习目标往往无法收敛,某些特殊的收敛点也不满足策略的合理性。对于不存在最优解的实际问题,强化学习算法更是束手无策,将博弈理论引入强化学习可以很好地解决智能体的相互关系,可以解释收敛点对应策略的合理性,更重要的是可以用均衡解来替代最优解以求得相对有效的策略。因此,从博弈论的角度梳理近年来出现的强化学习算法,总结当前博弈强化学习算法的重难点,并给出可能解决上述重难点的几个突破方向。

关键词: 多智能体, 强化学习, 博弈论