Computer Engineering and Applications ›› 2018, Vol. 54 ›› Issue (16): 129-134. DOI: 10.3778/j.issn.1002-8331.1704-0427


Reinforcement learning path planning algorithm based on gravitational potential field and trap search

DONG Peifang1, ZHANG Zhi’an1, MEI Xinhu2, ZHU Shuo1   

1. School of Mechanical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
2. School of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing 210094, China

Online: 2018-08-15    Published: 2018-08-09

Abstract: It is difficult for a mobile robot to obtain a good path in a complex environment. The Q-learning algorithm, based on the Markov decision process, can learn a good path by trial and error, but it converges slowly, requires many iterations, and its trial-and-error approach cannot be applied in a real environment. This paper adds a gravitational potential field to the Q-learning algorithm as prior information about the initial environment, searches the environment for trap regions layer by layer on that basis, and removes concave trap regions from the Q-value iteration, which speeds up the convergence of path planning. At the same time, trial-and-error learning on obstacles is eliminated, so the algorithm avoids obstacles effectively from the initial state and can be trained directly in a real environment. Complex maps are built with Python and the pygame module to verify the path-planning performance of the improved Q-learning algorithm with the initial gravitational potential field and trap search. Simulation results show that the improved algorithm reaches the target position quickly and effectively after fewer iterations, and the resulting path is near-optimal.

Key words: path planning, reinforcement learning, artificial potential field, trap search, Q value initialization
