Application of Deep Reinforcement Learning in Indoor UAV Target Search

doi:10.3778/j.issn.1002-8331.1907-0044

Computer Engineering and Applications, 2020, Vol. 56, Issue (17): 156-160. DOI: 10.3778/j.issn.1002-8331.1907-0044

LAI Jun, RAO Rui

College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, China

Online: 2020-09-01 Published: 2020-08-31

Abstract

To address the low efficiency and low accuracy of random target search by indoor UAVs, this paper proposes a curiosity-driven deep reinforcement learning method based on spatial location annotation. The exploration space is partitioned into regular hexagons, and the number of times the UAV visits each region is recorded. These visit counts serve as the curiosity signal and generate an intrinsic reward, which encourages the UAV to keep exploring new regions and keeps it from becoming trapped in a local area. During training, the Proximal Policy Optimization (PPO) algorithm is used to optimize the neural network parameters. With this approach the UAV finds the optimal search strategy faster and avoids obstacles better, which shortens the training period and improves search efficiency and accuracy.

Key words: deep reinforcement learning, indoor search, curiosity
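
The visit-count curiosity described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the reward formula, so the `beta / sqrt(N)` bonus (a common count-based form), the pointy-top hexagon layout, the cell size, and the names `hex_cell` and `HexCuriosity` are all hypothetical.

```python
import math
from collections import defaultdict


def hex_cell(x, y, size):
    """Map a planar position to axial (q, r) indices of a pointy-top hexagon grid.

    `size` is the hexagon circumradius (assumed; the abstract gives no cell size).
    """
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    # Cube rounding: round all three cube coordinates, then recompute the one
    # with the largest rounding error so that the three still sum to zero.
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr


class HexCuriosity:
    """Count-based intrinsic reward over hexagonal cells (assumed formula)."""

    def __init__(self, size=1.0, beta=0.1):
        self.size = size              # hexagon circumradius, hypothetical value
        self.beta = beta              # bonus scale, hypothetical value
        self.visits = defaultdict(int)

    def reward(self, x, y):
        """Record a visit at (x, y) and return a bonus that decays with the count."""
        cell = hex_cell(x, y, self.size)
        self.visits[cell] += 1
        return self.beta / math.sqrt(self.visits[cell])


if __name__ == "__main__":
    curiosity = HexCuriosity(size=1.0, beta=0.1)
    for position in [(0.1, 0.1), (0.2, 0.0), (math.sqrt(3), 0.0)]:
        print(position, curiosity.reward(*position))
```

In a training loop, a bonus of this kind would typically be added to the environment's extrinsic reward before the PPO update, so that rarely visited cells look more attractive than well-explored ones. How the paper weights the two reward terms is not stated in the abstract.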

LAI Jun, RAO Rui. Application of Deep Reinforcement Learning in Indoor UAV Target Search[J]. Computer Engineering and Applications, 2020, 56(17): 156-160.

