Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (18): 270-274. DOI: 10.3778/j.issn.1002-8331.2011-0414


Path Planning for Mobile Robot Using Improved Reinforcement Learning Algorithm

WANG Keyin, SHI Zhen, YANG Zhengcai, YANG Yahui, WANG Sishan   

  1. School of Automotive Engineering, Hubei University of Automotive Technology, Shiyan, Hubei 442002, China
    2. Key Laboratory of Automotive Power Train and Electronics (Hubei University of Automotive Technology), Shiyan, Hubei 442002, China
    3. Institute of Automotive Engineers, Hubei University of Automotive Technology, Shiyan, Hubei 442002, China
  Online: 2021-09-15    Published: 2021-09-13

Abstract:

To address the slow convergence, large number of iterations, and unstable convergence results of traditional reinforcement learning algorithms when a mobile robot plans a path in an unknown environment, an improved Q-learning algorithm is proposed. An artificial potential field is used to initialize the state values so that states closer to the target position receive larger values, which guides the agent toward the target and eliminates many of the invalid iterations caused by environment exploration in the early stage of the algorithm. An improved ε-greedy strategy is adopted for action selection: the greedy factor ε is adjusted dynamically according to the degree of convergence of the algorithm, which better balances exploration and exploitation, accelerates convergence, and improves the stability of the converged results. The proposed algorithm is verified by simulation on a grid map built with the Python Tkinter standard library. Simulation results show that, compared with the traditional Q-learning algorithm, the improved Q-learning algorithm reduces planning time by 85.1% and the number of iterations before convergence by 74.7%, while the stability of the convergence results is also greatly improved.
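The abstract does not give the exact potential function, reward design, or ε schedule, so the following is only a minimal Python sketch of the two ideas it describes: state values seeded by an attractive potential field, and a greedy factor that shrinks as learning proceeds. The grid size, Manhattan-distance potential, reward values, and the linear ε decay (standing in for the paper's convergence-based adjustment) are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical grid-world parameters; the paper does not specify them.
ROWS, COLS = 10, 10
GOAL = (9, 9)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
ALPHA, GAMMA = 0.1, 0.9                        # learning rate, discount
EPS_MAX, EPS_MIN = 0.9, 0.1                    # greedy-factor range

def potential_field_values(k=1.0):
    """Attractive-potential initialization: states closer to the goal
    receive larger values, one plausible reading of the APF idea."""
    V = np.zeros((ROWS, COLS))
    for r in range(ROWS):
        for c in range(COLS):
            d = abs(r - GOAL[0]) + abs(c - GOAL[1])   # Manhattan distance
            V[r, c] = k / (d + 1.0)                   # larger near the goal
    return V

def dynamic_epsilon(episode, n_episodes):
    """Shrink the greedy factor over training: explore early, exploit
    late. A linear schedule stands in for the paper's convergence-based
    adjustment, which the abstract does not detail."""
    frac = episode / max(n_episodes - 1, 1)
    return EPS_MAX - (EPS_MAX - EPS_MIN) * frac

def step(state, action):
    """Deterministic grid transition with a small step cost and a
    terminal reward at the goal (illustrative reward design)."""
    r = min(max(state[0] + action[0], 0), ROWS - 1)
    c = min(max(state[1] + action[1], 0), COLS - 1)
    nxt = (r, c)
    return nxt, (1.0 if nxt == GOAL else -0.01), nxt == GOAL

def train(n_episodes=300, seed=0):
    rng = np.random.default_rng(seed)
    # Seed every Q(s, a) with the potential of s, so the bootstrapped
    # target r + gamma * max_a' Q(s', a') is already larger for moves
    # toward the goal before any experience is collected.
    Q = np.repeat(potential_field_values()[:, :, None], len(ACTIONS), axis=2)
    for ep in range(n_episodes):
        eps = dynamic_epsilon(ep, n_episodes)
        s, done = (0, 0), False
        while not done:
            if rng.random() < eps:                       # explore
                a = int(rng.integers(len(ACTIONS)))
            else:                                        # exploit
                a = int(np.argmax(Q[s[0], s[1]]))
            s2, rwd, done = step(s, ACTIONS[a])
            # Do not bootstrap past the terminal goal state.
            td_target = rwd + (0.0 if done else GAMMA * np.max(Q[s2[0], s2[1]]))
            Q[s[0], s[1], a] += ALPHA * (td_target - Q[s[0], s[1], a])
            s = s2
    return Q

if __name__ == "__main__":
    Q = train()
    print("Greedy first move from (0, 0):", ACTIONS[int(np.argmax(Q[0, 0]))])
```

With the potential-seeded Q-table, the very first bootstrapped updates already favor moves that reduce the distance to the goal, which is the mechanism behind the reduction in invalid early-stage iterations that the abstract reports.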

Key words: reinforcement learning, artificial potential field, greedy strategy, mobile robots, path planning
