Computer Engineering and Applications, 2009, Vol. 45, Issue (16): 60-62. DOI: 10.3778/j.issn.1002-8331.2009.16.016

• Research and Discussion •

Multiagent Q-learning based on ant colony algorithm and roulette algorithm

MENG Xiang-ping1, WANG Sheng-bin2, WANG Xin-xin2

  1. Department of Electrical Engineering, Changchun Institute of Technology, Changchun 130012, China
  2. Department of Computer Engineering, Northeast Dianli University, Jilin 132012, China
  • Received: 2008-04-11 Revised: 2008-06-18 Online: 2009-06-01 Published: 2009-06-01
  • Contact: MENG Xiang-ping

基于蚁群算法和轮盘算法的多Agent Q学习

孟祥萍1,王圣镔2,王欣欣2   

  1. 长春工程学院 电气与信息学院,长春 130012
  2. 东北电力大学 信息工程学院,吉林 132012
  • Corresponding author: 孟祥萍

Abstract: The authors present a novel multiagent reinforcement learning algorithm based on Q-learning, the ant colony algorithm, and the roulette algorithm. In reinforcement learning, when the number of agents becomes large enough, every action selection method breaks down: the learning speed drops sharply. In addition, because each agent uses Q values to choose its next action, action selection in the early stage of learning is severely constrained by high Q values. The authors therefore combine the ant colony algorithm and the roulette algorithm with Q-learning in the hope of resolving both problems. Finally, theoretical analysis and experimental results both show that the improved Q-learning is feasible and increases learning efficiency.

Key words: multiagent reinforcement learning algorithm, ant colony algorithm, roulette algorithm

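Abstract (translated from the Chinese): A novel multi-agent reinforcement learning method based on Q-learning, the ant colony algorithm, and the roulette wheel algorithm is proposed. In reinforcement learning, once the number of agents grows large enough, the action space becomes catastrophically large, that is, the learning speed drops abruptly. Moreover, because an agent uses Q values to select its next action, action selection early in learning is tightly bound to high Q values. The ant colony algorithm, the roulette wheel algorithm, and reinforcement learning are combined in the hope of solving these problems. Finally, theoretical analysis and experimental results for the new algorithm both show that the improved Q-learning is feasible and can effectively raise learning efficiency.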

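Key words (translated from the Chinese): multi-agent reinforcement learning algorithm, ant colony algorithm, roulette wheel algorithm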
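
The abstract says that choosing actions by Q value alone ties early learning to whichever action already has a high Q value, and that roulette selection is one of the remedies. The sketch below illustrates that idea only; it is not the authors' algorithm, whose details are not given on this page. The function names `roulette_select` and `q_update`, the `offset` parameter, and the values of `alpha` and `gamma` are assumptions made for illustration, and the ant colony (pheromone) component is left out because the abstract does not describe it.

```python
import random


def roulette_select(q_values, offset=1.0, rng=random):
    """Pick an action index with probability proportional to its weight.

    Assumption: weights are Q values shifted to be strictly positive,
    so even the lowest-valued action keeps a nonzero chance.
    """
    low = min(q_values)
    weights = [q - low + offset for q in q_values]
    pick = rng.random() * sum(weights)
    cumulative = 0.0
    for action, weight in enumerate(weights):
        cumulative += weight
        if pick < cumulative:
            return action
    return len(q_values) - 1  # guard against floating-point round-off


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update on a dict of per-state lists."""
    best_next = max(q_table[next_state])
    td_error = reward + gamma * best_next - q_table[state][action]
    q_table[state][action] += alpha * td_error


if __name__ == "__main__":
    rng = random.Random(0)
    q_table = {0: [0.0, 0.0], 1: [1.0, 2.0]}
    action = roulette_select(q_table[0], rng=rng)
    q_update(q_table, state=0, action=action, reward=1.0, next_state=1)
    print(action, q_table[0])
```

Unlike a greedy rule, which always returns the action with the largest Q value, the roulette rule samples every action in proportion to its weight, so lower-valued actions are still explored while the Q estimates remain unreliable.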