Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (21): 256-262. DOI: 10.3778/j.issn.1002-8331.2106-0040

Path Planning for Indoor Mobile Robot with Improved Deep Reinforcement Learning

CHENG Yi, HAO Mimi   

  1. School of Control Science and Engineering, Tiangong University, Tianjin 300387, China
  • Online: 2021-11-01 Published: 2021-11-04

Abstract:

To address the poor exploration ability and the sparse rewards in the environment state space that limit traditional deep reinforcement learning when planning paths for a mobile robot in an unknown indoor environment, an improved deep reinforcement learning algorithm based on depth image information is proposed. The depth image information and target position information obtained directly from the Kinect visual sensor are used as the network input, and the linear velocity and angular velocity of the robot are output as the next action command. An improved reward and punishment function is designed that raises the reward value obtained by the algorithm and optimizes the state space, which alleviates the reward sparsity problem to a certain extent. The simulation results show that the improved algorithm strengthens the exploration ability of the robot and optimizes the path trajectory, so that the robot avoids obstacles effectively and plans a shorter path. Compared with the DQN algorithm, the average path length is shortened by 21.4% in the simple environment and by 11.3% in the complex environment.
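
The abstract does not give the form of the improved reward and punishment function. As a rough illustration of how a denser reward can ease reward sparsity in goal-directed navigation, the sketch below adds a per-step term proportional to the progress made toward the target, on top of terminal rewards for reaching the goal or colliding. It is a generic shaping scheme and not the authors' function: the names `RewardConfig` and `shaped_reward`, and every threshold and constant, are assumptions chosen only for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RewardConfig:
    # All values are illustrative assumptions, not taken from the paper.
    goal_radius: float = 0.3           # metres; closer than this counts as reaching the target
    collision_dist: float = 0.2        # metres; a smaller minimum depth reading counts as a collision
    goal_reward: float = 100.0         # terminal reward for reaching the target
    collision_penalty: float = -100.0  # terminal penalty for hitting an obstacle
    progress_gain: float = 10.0        # scales the dense reward for moving toward the target
    step_penalty: float = -0.1         # small per-step cost that favours shorter paths


def shaped_reward(
    prev_dist: float,
    curr_dist: float,
    min_depth: float,
    cfg: Optional[RewardConfig] = None,
) -> Tuple[float, bool]:
    """Return (reward, episode_done) for a single step.

    prev_dist / curr_dist: distance to the target before and after the step.
    min_depth: smallest valid depth reading in the current depth image.
    """
    if cfg is None:
        cfg = RewardConfig()
    if curr_dist < cfg.goal_radius:
        return cfg.goal_reward, True
    if min_depth < cfg.collision_dist:
        return cfg.collision_penalty, True
    # Dense term: positive when the robot moved closer, negative when it moved away.
    progress = prev_dist - curr_dist
    return cfg.progress_gain * progress + cfg.step_penalty, False
```

With only the two terminal rewards, almost every step would return zero and the agent would receive feedback only at the end of an episode. The progress term gives a non-zero signal on each step, which is one common way of making the reward less sparse.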

Key words: path planning, depth image information, Kinect visual sensor, deep reinforcement learning, reward and punishment function, exploration ability