Multiagent Q-learning based on ant colony algorithm and roulette algorithm

doi:10.3778/j.issn.1002-8331.2009.16.016

Computer Engineering and Applications, 2009, Vol. 45, Issue (16): 60-62. DOI: 10.3778/j.issn.1002-8331.2009.16.016

MENG Xiang-ping (孟祥萍)¹, WANG Sheng-bin (王圣镔)², WANG Xin-xin (王欣欣)²

1. Department of Electrical Engineering, Changchun Institute of Technology, Changchun 130012, China
2. Department of Computer Engineering, Northeast Dianli University, Jilin 132012, China

Received: 2008-04-11 Revised: 2008-06-18 Online: 2009-06-01 Published: 2009-06-01
Contact: MENG Xiang-ping

Abstract

This paper presents a novel multiagent reinforcement learning algorithm based on Q-learning, the ant colony algorithm, and the roulette wheel algorithm. In reinforcement learning, once the number of agents grows large enough, the action space explodes and the learning speed drops sharply. In addition, because each agent uses Q values to choose its next action, action selection in the early stage of learning is strongly constrained by the actions that currently have high Q values. The authors combine the ant colony algorithm, the roulette wheel algorithm, and reinforcement learning in an attempt to address both problems. Theoretical analysis and experimental results both indicate that the improved Q-learning is feasible and can effectively increase learning efficiency.

Key words: multiagent reinforcement learning algorithm, ant colony algorithm, roulette algorithm
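
As a rough illustration of the roulette wheel selection mentioned in the abstract, the sketch below picks an action with probability proportional to a weight derived from its Q value, so that lower-valued actions still get tried early in learning. The abstract does not say how the authors map Q values to selection probabilities or how ant colony pheromone enters the choice, so the function name `roulette_select`, the `offset` parameter, and the min-shift weighting are assumptions of this sketch, not the paper's method.

```python
import random


def roulette_select(q_values, offset=1.0, rng=random):
    """Return an action index drawn in proportion to a shifted Q value.

    Hypothetical helper: the shift by the minimum and the `offset`
    constant are assumptions made here so that every weight is positive.
    """
    lowest = min(q_values)
    # Every action keeps a non-zero weight, even the one with the lowest Q.
    weights = [q - lowest + offset for q in q_values]
    total = sum(weights)

    # Spin the wheel: draw a point in [0, total) and walk the cumulative
    # weights until the point falls inside an action's slice.
    threshold = rng.random() * total
    cumulative = 0.0
    for action, weight in enumerate(weights):
        cumulative += weight
        if threshold < cumulative:
            return action
    # Guard against floating-point round-off on the last slice.
    return len(weights) - 1


if __name__ == "__main__":
    rng = random.Random(0)
    q_row = [1.0, 2.0, 5.0]  # made-up Q values for one state
    counts = [0] * len(q_row)
    for _ in range(3000):
        counts[roulette_select(q_row, rng=rng)] += 1
    print(counts)
```

Compared with always taking the highest-valued action, this kind of draw still favours high Q values but leaves room for exploration; a larger `offset` flattens the distribution further.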

MENG Xiang-ping, WANG Sheng-bin, WANG Xin-xin. Multiagent Q-learning based on ant colony algorithm and roulette algorithm[J]. Computer Engineering and Applications, 2009, 45(16): 60-62.

