Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (9): 19-29. DOI: 10.3778/j.issn.1002-8331.2202-0297

• Hot Topics and Reviews •

Survey of Opponent Modeling Methods and Applications in Intelligent Game Confrontation

WEI Tingting, YUAN Weilin, LUO Junren, ZHANG Wanpeng   

  1. College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
  • Online: 2022-05-01 Published: 2022-05-01

Abstract: Intelligent game confrontation has long been a focus of artificial intelligence research. In an adversarial game environment, opponent modeling makes it possible to infer the actions, goals, strategies, and other related attributes of opposing agents, which provides key information for formulating game strategies. Opponent modeling methods have broad application prospects in competitive games and combat simulation. Because a game strategy must be formulated on the basis of the action strategies of all parties in the game, establishing an accurate model of opponent behavior is especially important for predicting the opponent's intentions. From the three perspectives of connotation, methods, and applications, the necessity of opponent modeling is expounded and the existing modeling approaches are classified. Prediction methods based on reinforcement learning, reasoning methods based on theory of mind, and optimization methods based on Bayesian approaches are reviewed and summarized. Taking sequential games (Texas Hold’em), real-time strategy games (StarCraft), and meta-games as typical application scenarios, the role of opponent modeling in intelligent game confrontation is analyzed. Finally, prospects for the development of opponent modeling techniques are discussed from three aspects: bounded rationality, strategic deception, and interpretability.

Key words: opponent modeling, imperfect information, behavior prediction, deep reinforcement learning, recursive reasoning, meta-game