Computer Engineering and Applications, 2021, Vol. 57, Issue (19): 44-56. DOI: 10.3778/j.issn.1002-8331.2104-0369

• Hot Topics and Reviews •

Progress on Deep Reinforcement Learning in Path Planning

ZHANG Rongxia, WU Changxu, SUN Tongchao, ZHAO Zengshun   

  1. College of Electronic and Information Engineering, Shandong University of Science and Technology, Qingdao, Shandong 266590, China
  • Online: 2021-10-01 Published: 2021-09-29

Abstract:

Path planning aims to let a robot avoid obstacles while quickly finding the shortest path as it moves. After analyzing the advantages and disadvantages of path planning algorithms based on reinforcement learning, the paper introduces the Deep Q-learning Network (DQN), a typical deep reinforcement learning algorithm that can plan paths well in complex dynamic environments. The basic principles and limitations of DQN are analyzed in depth, the strengths and weaknesses of its variants are compared, and the variants are then classified along four lines: training algorithms, neural network structures, learning mechanisms, and variants of the Actor-Critic (AC) framework. The paper identifies the challenges and open problems facing path planning methods based on deep reinforcement learning and outlines future directions, providing a reference for the development of intelligent robot path planning and autonomous driving.

Key words: deep reinforcement learning, path planning, neural network structure, Actor-Critic (AC) framework