Parallel reinforcement learning algorithm and its application

doi:10.3778/j.issn.1002-8331.2009.34.008

Computer Engineering and Applications, 2009, Vol. 45, Issue 34: 25-28. DOI: 10.3778/j.issn.1002-8331.2009.34.008

• Doctoral Forum •


MENG Wei¹, HAN Xue-dong²

1. Information School, Beijing Forestry University, Beijing 100083, China
2. 706 Institute of China Aerospace Science and Industry Corporation, Beijing 100854, China

Received: 2009-08-11; Revised: 2009-10-09; Online: 2009-12-01; Published: 2009-12-01
Contact: MENG Wei


Abstract

Abstract: Reinforcement learning is an important machine learning method. However, slow convergence has been one of its main problems in practice. To improve the efficiency of reinforcement learning, this paper proposes a parallel reinforcement learning algorithm. The learning system contains multiple agents. Within a learning episode, each agent learns independently. After each learning episode, the results of all agents are fused using D-S evidence theory to obtain a common result, which is shared by all agents in the next learning episode. Experiments show the feasibility and efficiency of the algorithm.
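The fusion step can be illustrated with a minimal sketch of Dempster's rule of combination. The abstract does not say how each agent's learning result is turned into evidence, so the conversion below (a softmax over one state's Q-values, with a fixed share of mass reserved for ignorance) is an assumption made only for illustration, and the names `dempster_combine`, `q_row_to_mass`, and `fuse_agents` are hypothetical rather than taken from the paper.

```python
import math
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    A mass function maps a frozenset of hypotheses (a focal element)
    to its mass; the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass falling on contradictory pairs
    if not combined:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalise by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


def q_row_to_mass(q_row, ignorance=0.1):
    """ASSUMPTION (not from the paper): softmax over the actions of one
    state, with `ignorance` mass left on the whole action set."""
    theta = frozenset(q_row)
    top = max(q_row.values())
    weights = {a: math.exp(q - top) for a, q in q_row.items()}
    total = sum(weights.values())
    mass = {}
    for a, w in weights.items():
        key = frozenset([a])
        mass[key] = mass.get(key, 0.0) + (1.0 - ignorance) * w / total
    mass[theta] = mass.get(theta, 0.0) + ignorance
    return mass


def fuse_agents(q_rows, ignorance=0.1):
    """Fuse the evidence of several agents for a single state."""
    masses = [q_row_to_mass(row, ignorance) for row in q_rows]
    fused = masses[0]
    for m in masses[1:]:
        fused = dempster_combine(fused, m)
    return fused


if __name__ == "__main__":
    # Two hypothetical agents that both prefer "up" in the same state.
    rows = [{"up": 1.0, "down": 0.0}, {"up": 0.8, "down": 0.1}]
    for focal, mass in fuse_agents(rows).items():
        print(sorted(focal), round(mass, 3))
```

In this sketch, agents that agree reinforce each other: the fused mass on the shared preferred action is larger than either agent's own. How the fused result is written back into each agent's table for the next episode is not stated in the abstract and is left out here.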

Key words: parallel algorithms, reinforcement learning, Q-learning, D-S evidence theory, path planning


CLC Number: TP18

Cite this article: MENG Wei, HAN Xue-dong. Parallel reinforcement learning algorithm and its application[J]. Computer Engineering and Applications, 2009, 45(34): 25-28.


