Application of RVO-DDPG Algorithm in Multi-UAV Consolidation Route Planning

doi:10.3778/j.issn.1002-8331.2107-0017

Abstract

Traditional intelligent optimization algorithms require heavy computation and long run times when planning assembly routes for multiple UAVs in uncertain, complex environments. To address this, a deep deterministic policy gradient (DDPG) algorithm based on the reciprocal velocity obstacle (RVO) method is proposed. For dynamic obstacles in the uncertain environment and for other UAVs in the formation, the UAV heading is adjusted by the velocity obstacle method to avoid collisions, which improves the convergence speed of the algorithm. A reward function based on comprehensive cost is designed, which transforms the multi-objective optimization problem in multi-UAV route planning into the reward function design problem of the DDPG algorithm. The performance of the algorithm is verified through simulation on the PyCharm software platform, and the RVO-DDPG algorithm is compared with several other algorithms. The simulation experiments show that the RVO-DDPG algorithm makes decisions faster and is more practical.

Key words: unmanned aerial vehicle (UAV), route planning, formation assembly, deep deterministic policy gradient (DDPG), reciprocal velocity obstacle (RVO)

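Chinese abstract: Traditional intelligent optimization algorithms are computationally expensive and time-consuming when planning multi-UAV assembly routes in uncertain, complex environments. To address this, a deep deterministic policy gradient (DDPG) algorithm based on the reciprocal velocity obstacle (RVO) method is proposed. The RVO method is introduced to guide UAVs in avoiding collisions with obstacles in the uncertain environment, which effectively speeds up the convergence of the target actor network and improves the learning efficiency of the algorithm. A reward function based on comprehensive cost is designed, transforming the multi-objective optimization problem in multi-UAV route planning into the reward function design problem of the DDPG algorithm; this design effectively addresses the tendency of the traditional DDPG algorithm to produce locally optimal solutions. The performance of the algorithm is verified through simulation on the PyCharm software platform and compared with several other algorithms. The simulation experiments show that the RVO-DDPG algorithm has faster decision-making speed and better practicability.

Chinese key words: unmanned aerial vehicle, route planning, formation assembly, deep deterministic policy gradient (DDPG) algorithm, reciprocal velocity obstacle (RVO) method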
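
The abstract names the reciprocal velocity obstacle (RVO) method as the component that adjusts UAV headings to avoid collisions. As a rough illustration of that idea, the Python sketch below implements the textbook two-dimensional RVO membership test and a simple heading sweep. It is not the authors' implementation: the function names, the disc-shaped agent model, the constant-velocity assumption, and the 5-degree sweep step are all assumptions made for this example.

```python
import math


def in_velocity_obstacle(rel_pos, rel_vel, combined_radius):
    """Return True if a constant relative velocity leads to contact.

    rel_pos is the other agent's position minus ours, rel_vel is our
    velocity minus the other agent's, and combined_radius is the sum of
    both (assumed disc-shaped) agents' radii.
    """
    px, py = rel_pos
    vx, vy = rel_vel
    if px * px + py * py <= combined_radius ** 2:
        return True  # discs already touch or overlap
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        return False  # no relative motion, the gap never closes
    # Time of closest approach along the relative-velocity ray.
    t = (px * vx + py * vy) / speed_sq
    if t <= 0.0:
        return False  # closest approach lies in the past
    cx, cy = px - t * vx, py - t * vy
    return cx * cx + cy * cy < combined_radius ** 2


def in_rvo(pos_a, vel_a, pos_b, vel_b, combined_radius, candidate):
    """Textbook RVO test: candidate v is rejected if 2v - vel_a lies in
    the velocity obstacle induced by agent B moving at vel_b."""
    rel_pos = (pos_b[0] - pos_a[0], pos_b[1] - pos_a[1])
    rel_vel = (2.0 * candidate[0] - vel_a[0] - vel_b[0],
               2.0 * candidate[1] - vel_a[1] - vel_b[1])
    return in_velocity_obstacle(rel_pos, rel_vel, combined_radius)


def pick_heading(pos_a, vel_a, pos_b, vel_b, combined_radius,
                 speed, desired_heading, step_deg=5):
    """Sweep headings outward from desired_heading (radians) and return
    the first one outside the RVO, or None if every heading is blocked."""
    offsets = [0]
    for k in range(1, 180 // step_deg + 1):
        offsets += [k * step_deg, -k * step_deg]
    for off in offsets:
        h = desired_heading + math.radians(off)
        candidate = (speed * math.cos(h), speed * math.sin(h))
        if not in_rvo(pos_a, vel_a, pos_b, vel_b, combined_radius, candidate):
            return h
    return None


# Hypothetical head-on encounter: A flies along +x, B along -x.
heading = pick_heading((0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (-1.0, 0.0),
                       combined_radius=1.0, speed=1.0, desired_heading=0.0)
print(math.degrees(heading))
```

In a learning setup such as the one the abstract describes, a check of this kind could be used to correct or mask actions proposed by the policy before they are executed; how the paper actually couples RVO with DDPG is not specified here.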

YANG Xiuxia, GAO Hengjie, LIU Wei, ZHANG Yi. Application of RVO-DDPG Algorithm in Multi-UAV Consolidation Route Planning[J]. Computer Engineering and Applications, 2023, 59(1): 308-316.

杨秀霞, 高恒杰, 刘伟, 张毅. RVO-DDPG算法在多UAV集结航路规划的应用[J]. 计算机工程与应用, 2023, 59(1): 308-316.
