UAV Indoor 3D Track Planning Based on Improved Reinforcement Learning Algorithm

doi:10.3778/j.issn.1002-8331.2004-0363

Abstract

With the rise of indoor navigation and positioning technology, the use of Unmanned Aerial Vehicles (UAVs) in indoor environments has seen unprecedented growth, which places higher demands on UAV track planning. Because indoor spaces are complex and existing reinforcement learning algorithms converge slowly, this paper proposes an integrated method based on reinforcement learning. First, the line connecting the start and end coordinates is used to identify the main obstacles and the nodes surrounding them, which avoids searching irrelevant nodes and reduces the space complexity. Second, a direction trend function is constructed from a mathematical relationship during Q-value initialization to indicate the direction of the target point and improve the convergence speed of the algorithm. Finally, the optimized algorithm is simulated and verified on a three-dimensional grid map. The simulation results show that, compared with the standard Q-learning algorithm, the improved Q-learning algorithm reduces the number of spatial search nodes by 55.49% and shortens the convergence time by 98.57%.

Keywords: track planning, target direction, Main Obstacles and Surrounding Point (MO-SP), Unmanned Aerial Vehicle (UAV), reinforcement learning
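
The abstract does not give the form of the direction trend function or the exact test used to pick out the main obstacles, so the following is only a minimal sketch of the two ideas under stated assumptions. It assumes a 26-connected 3D grid, uses the cosine between a move and the start-to-target direction as a stand-in trend score, and finds main obstacles by sampling the straight start-goal segment. The names `direction_trend`, `init_q`, `main_obstacles`, and the `scale` parameter are hypothetical and are not taken from the paper.

```python
import itertools
import math

# Assumption: 26-connected moves in a 3D grid (every neighbour offset except "stay put").
ACTIONS = [a for a in itertools.product((-1, 0, 1), repeat=3) if a != (0, 0, 0)]


def direction_trend(state, goal, action):
    """Hypothetical trend score: cosine of the angle between a move and the goal direction."""
    to_goal = [g - s for g, s in zip(goal, state)]
    dist = math.sqrt(sum(c * c for c in to_goal))
    if dist == 0.0:
        return 0.0  # already at the goal: no preferred direction
    step = math.sqrt(sum(c * c for c in action))
    return sum(a * c for a, c in zip(action, to_goal)) / (step * dist)


def init_q(shape, goal, scale=1.0):
    """Build a Q-table whose initial values favour moves that point toward the goal.

    Returns a dict where q[state][k] is the initial value of ACTIONS[k] in `state`.
    """
    return {
        state: [scale * direction_trend(state, goal, a) for a in ACTIONS]
        for state in itertools.product(*(range(n) for n in shape))
    }


def main_obstacles(obstacle_id, start, goal):
    """Labels of obstacles crossed by the straight start-goal segment.

    `obstacle_id` maps each occupied cell (x, y, z) to an obstacle label.
    The segment is sampled at half-cell spacing along its longest axis,
    which is an approximation rather than an exact voxel traversal.
    """
    n = 2 * max(abs(g - s) for g, s in zip(goal, start)) + 1
    hit = set()
    for i in range(n):
        t = i / (n - 1) if n > 1 else 0.0
        cell = tuple(round(s + t * (g - s)) for s, g in zip(start, goal))
        if cell in obstacle_id:
            hit.add(obstacle_id[cell])
    return hit
```

With this initialization, a greedy agent at a corner of an empty grid already prefers the diagonal move toward the opposite corner before any learning has taken place, which is the kind of head start the abstract attributes to the direction trend function. Restricting later search to the cells around the obstacles returned by `main_obstacles` would correspond to the MO-SP step, although how the surrounding nodes are chosen is not specified in the abstract.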

ZHANG Jun, ZHU Qingwei, YAN Junjie, WEN Bo. UAV Indoor 3D Track Planning Based on Improved Reinforcement Learning Algorithm[J]. Computer Engineering and Applications, 2021, 57(16): 175-181.
