Option automatic generation algorithm based on ACCA

Computer Engineering and Applications, 2008, Vol. 44, Issue (19): 39-40.

HU Ming-hui, YIN Chang-ming, LI Li-yun

College of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410076, China

Received: 2007-09-27  Revised: 2007-12-13  Online: 2008-07-01  Published: 2008-07-01
Contact: HU Ming-hui

Abstract

Abstract: A new algorithm for automatically generating Options in hierarchical reinforcement learning (HRL) is presented. The algorithm takes as input the state space explored by the Agent during the initial learning phase and clusters these states with an improved Ant Colony Clustering Algorithm (ACCA). On each of the resulting state subsets, a set of internal policies is learned through experience replay, and the Options are generated from these policies. Simulation experiments show that the algorithm is effective.

Key words: hierarchical reinforcement learning, Option, Ant Colony Clustering Algorithm (ACCA), experience replay
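
The pipeline summarized in the abstract (explored states, clustering, experience replay on each cluster, one Option per cluster) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the ACCA clustering step is not reproduced and is replaced by a hand-written `cluster_of` mapping, the internal policy is learned with tabular Q-learning updates replayed over stored transitions, and the pseudo-reward (+1 for leaving the cluster, -1 per step inside it) is a hypothetical choice. The names `Option`, `build_options`, `transitions`, and `cluster_of` are all invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Option:
    initiation: frozenset  # states where the option may be invoked (one cluster)
    policy: dict           # internal policy: state -> action

    def terminates(self, state):
        # Assumption: the option ends as soon as the agent leaves its cluster.
        return state not in self.initiation


def build_options(transitions, cluster_of, n_actions,
                  sweeps=200, alpha=0.5, gamma=0.9):
    """Learn one Option per cluster by replaying stored transitions.

    transitions: list of (state, action, next_state) recorded during exploration.
    cluster_of:  dict state -> cluster id (stand-in for the ACCA output).
    """
    options = {}
    for cid in set(cluster_of.values()):
        members = frozenset(s for s, c in cluster_of.items() if c == cid)
        replay = [t for t in transitions if t[0] in members]
        q = {(s, a): 0.0 for s in members for a in range(n_actions)}
        for _ in range(sweeps):              # experience replay: reuse the buffer
            for s, a, s2 in replay:
                if s2 in members:
                    best_next = max(q[(s2, b)] for b in range(n_actions))
                    target = -1.0 + gamma * best_next
                else:
                    target = 1.0             # hypothetical reward for exiting
                q[(s, a)] += alpha * (target - q[(s, a)])
        policy = {s: max(range(n_actions), key=lambda a: q[(s, a)])
                  for s in members}
        options[cid] = Option(members, policy)
    return options


# Toy usage: a corridor of states 0..5 with actions 0 = left, 1 = right.
transitions = []
for s in range(6):
    transitions.append((s, 0, max(0, s - 1)))
    transitions.append((s, 1, min(5, s + 1)))

# Hand-made two-cluster split standing in for the clustering result.
cluster_of = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

options = build_options(transitions, cluster_of, n_actions=2)
```

In this toy corridor, each learned internal policy heads toward the boundary between the two clusters, which is the only place where the cluster can be exited. How the paper actually defines the termination condition and the reward used during replay is not stated in the abstract.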

HU Ming-hui, YIN Chang-ming, LI Li-yun. Option automatic generation algorithm based on ACCA[J]. Computer Engineering and Applications, 2008, 44(19): 39-40.
