Application for Improved TD3 Algorithm in Obstacle Avoidance of Quad-Rotor UAV

doi:10.3778/j.issn.1002-8331.2003-0163

Abstract

To improve the intelligent obstacle avoidance performance of Unmanned Aerial Vehicles (UAVs), an improved algorithm based on Twin Delayed Deep Deterministic Policy Gradient (TD3), called Improved Twin Delayed Deep Deterministic Policy Gradient (I-TD3), is proposed. The algorithm sets up two experience buffer pools to separate successful flight experience from failed flight experience and, according to the different purpose of each pool, combines them with Prioritized Experience Replay and ordinary Experience Replay respectively. This raises the sampling efficiency of effective experience and alleviates the low training efficiency caused by an excess of invalid experience. In addition, the reward function is redesigned to address the poor training results caused by an unreasonable reward setting. Simulation experiments with a quad-rotor UAV on the AirSim platform show that the I-TD3 algorithm achieves better obstacle avoidance than both the TD3 algorithm and the Deep Deterministic Policy Gradient (DDPG) algorithm.

Key words: Twin Delayed Deep Deterministic Policy Gradient (TD3), prioritized experience replay, obstacle avoidance, quad-rotor unmanned aerial vehicle
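
The two-pool idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say which pool uses which replay method, so the sketch assumes the success pool is sampled by priority and the failure pool uniformly. The class name `DualReplayBuffer`, the mixing ratio `success_ratio`, and the exponent `alpha` are all hypothetical choices made for illustration.

```python
import random
from collections import deque


class DualReplayBuffer:
    """Two pools: one for successful flights, one for failed flights.

    Assumption: the success pool is sampled by priority and the failure
    pool uniformly; the abstract does not state which pool uses which.
    """

    def __init__(self, capacity=10000, success_ratio=0.5, alpha=0.6, seed=None):
        self.success = deque(maxlen=capacity)     # transitions from successful episodes
        self.priorities = deque(maxlen=capacity)  # one priority per success transition
        self.failure = deque(maxlen=capacity)     # transitions from failed episodes
        self.success_ratio = success_ratio        # hypothetical share of each batch
        self.alpha = alpha                        # priority exponent (assumed value)
        self.rng = random.Random(seed)

    def add_episode(self, transitions, succeeded):
        """Route a whole episode to one pool based on its outcome."""
        if succeeded:
            # New transitions get the current maximum priority.
            p_max = max(self.priorities, default=1.0)
            for t in transitions:
                self.success.append(t)
                self.priorities.append(p_max)
        else:
            self.failure.extend(transitions)

    def sample(self, batch_size):
        """Return (batch, indices of the success-pool items in the batch)."""
        n_s = int(round(batch_size * self.success_ratio)) if self.success else 0
        if not self.failure:
            n_s = batch_size if self.success else 0
        n_f = batch_size - n_s if self.failure else 0
        batch, indices = [], []
        if n_s:
            # Prioritized draw: probability proportional to priority ** alpha.
            weights = [p ** self.alpha for p in self.priorities]
            indices = self.rng.choices(range(len(self.success)), weights=weights, k=n_s)
            batch += [self.success[i] for i in indices]
        if n_f:
            # Uniform draw (with replacement) from the failure pool.
            batch += self.rng.choices(list(self.failure), k=n_f)
        return batch, indices

    def update_priorities(self, indices, td_errors, eps=1e-3):
        """Set priorities from absolute TD errors after a critic update."""
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + eps
```

In a training loop, each finished episode would be passed to `add_episode` together with its outcome, and the indices returned by `sample` would be fed back to `update_priorities` before any new episode is added, since appending to a full pool shifts positions.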


TANG Lei, LIU Guangzhong. Application for Improved TD3 Algorithm in Obstacle Avoidance of Quad-Rotor UAV[J]. Computer Engineering and Applications, 2021, 57(11): 254-259.


