Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (26): 62-64. DOI: 10.3778/j.issn.1002-8331.2008.26.018

• Theoretical Research •

Cooperative multi-agent reinforcement learning based on quantum computing

TAN Wan-yu1, WANG Jian-zhong2, MENG Xiang-ping1

  1. School of Electrical Engineering and Information Technology, Changchun Institute of Technology, Changchun 130012, China
  2. School of Information Engineering, Northeast Dianli University, Jilin 132012, China
  • Received: 2007-11-08 Revised: 2008-01-21 Online: 2008-09-11 Published: 2008-09-11
  • Contact: TAN Wan-yu

Multi-agent cooperative learning algorithm based on quantum computing


Abstract: Because the agents in a cooperative multi-agent system interact, the complexity of the learning problem can rise rapidly with the number of agents or the sophistication of their behaviors. To converge to a desirable equilibrium, agents generally need to explore the strategy space sufficiently and to coordinate their policies so that they reach the optimal equilibrium. A novel cooperative multi-agent learning method based on quantum theory and reinforcement learning is proposed. The method coordinates agents' behaviors using quantum entanglement, lets agents select actions from a quantum superposition, and adopts Grover's search algorithm to speed up learning. It also strikes a good balance between exploration and exploitation by using the probabilistic character of quantum theory. Simulation results show that quantum theory can be applied effectively to cooperative multi-agent reinforcement learning.
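The abstract does not state the update rule itself, so the following is only a minimal classical simulation of the idea it describes: actions held as probability amplitudes, a Grover-style iteration that amplifies one action, and a simulated measurement that picks the action. The function names, the number of actions, the choice of `good_action`, and the fixed iteration count are all assumptions made for illustration, not details taken from the paper.

```python
import math
import random


def grover_iteration(amplitudes, target):
    """One Grover iteration on a real amplitude vector.

    Step 1 (oracle): flip the sign of the target action's amplitude.
    Step 2 (diffusion): reflect every amplitude about the mean.
    """
    flipped = [-a if i == target else a for i, a in enumerate(amplitudes)]
    mean = sum(flipped) / len(flipped)
    return [2.0 * mean - a for a in flipped]


def select_action(amplitudes, rng):
    """Simulated measurement: action i is observed with probability amp_i ** 2."""
    weights = [a * a for a in amplitudes]
    return rng.choices(range(len(amplitudes)), weights=weights, k=1)[0]


n_actions = 8                                   # hypothetical action-set size
amplitudes = [1.0 / math.sqrt(n_actions)] * n_actions  # uniform superposition
good_action = 3                                 # hypothetical: action favoured by reward

# Assumption: two iterations; a learner could instead tie this count to the reward.
amplified = amplitudes
for _ in range(2):
    amplified = grover_iteration(amplified, good_action)

rng = random.Random(0)
action = select_action(amplified, rng)

# Probability of observing good_action rises from 1/8 to about 0.945.
print(amplified[good_action] ** 2, action)
```

Because the favoured action is chosen with high but not certain probability, the remaining probability mass still allows occasional exploration of other actions, which is one way to read the exploration and exploitation tradeoff mentioned above.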

Key words: multi-agent system, cooperative, quantum computing, Q-learning, equilibrium

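Summary: Cooperative multi-agent reinforcement learning suffers from the curse of dimensionality in its action and state spaces, and action selection admits multiple equilibria, so converging to the best equilibrium requires searching the strategy space and coordinating policy selection. To address these problems, a novel multi-agent cooperative learning algorithm based on quantum theory is proposed. Drawing on quantum computing theory, the new algorithm represents the agents' action and state spaces as quantum superposition states, uses quantum entanglement to coordinate policy selection, uses probability amplitudes to represent action-selection probabilities, and uses a quantum search algorithm to speed up multi-agent learning. Simulation results demonstrate the effectiveness of the new algorithm.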
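For readers unfamiliar with the notation behind these terms, the following uses standard textbook quantum notation rather than formulas from the paper. The symbols $A$ (an agent's action set), $\alpha_a$ (the amplitude of action $a$), and the two-action entangled state $|\Phi\rangle$ are assumptions chosen for illustration.

```latex
% Superposition over an agent's actions, with the normalization condition
% and the probability of observing action a on measurement.
\[
|\psi\rangle = \sum_{a \in A} \alpha_a\,|a\rangle,
\qquad \sum_{a \in A} |\alpha_a|^2 = 1,
\qquad P(a) = |\alpha_a|^2
\]

% A maximally entangled joint state of two agents over actions a_1 and a_2:
% measuring one agent's action fixes the other's to the same action.
\[
|\Phi\rangle = \frac{1}{\sqrt{2}}\bigl(|a_1\rangle|a_1\rangle + |a_2\rangle|a_2\rangle\bigr)
\]
```

In a state such as $|\Phi\rangle$ the joint outcomes $(a_1, a_2)$ and $(a_2, a_1)$ have zero probability, which illustrates how entanglement could keep two agents from selecting mismatched actions.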
