Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (3): 246-251. DOI: 10.3778/j.issn.1002-8331.1907-0366

• Engineering and Applications •

RNSGA-II Algorithm Supporting Reinforcement Learning and Its Application in UAV Path Planning

FENG Shuo, ZHENG Baojuan, CHEN Wenxing, ZHANG Tingyu

  1. School of Construction Machinery, Chang’an University, Xi’an 710064, China
  2. School of Sciences, Chang’an University, Xi’an 710064, China
  3. School of Mathematics and Statistics, Wuhan University, Wuhan 430072, China
  • Online: 2020-02-01 Published: 2020-01-20

Abstract: To address the premature convergence and insufficient diversity of the traditional non-dominated sorting genetic algorithm II (NSGA-II) in multi-objective three-dimensional UAV path planning, an RNSGA-II algorithm supporting reinforcement learning is proposed. Two independent populations each evolve under NSGA-II, and individuals migrate between the two populations at generational intervals, after which each population continues its own optimization. Based on changes in population diversity, a reinforcement learning algorithm dynamically adjusts the migration ratio between the populations, so that diversity is maintained throughout evolution and the conflict between convergence speed and global convergence is alleviated to some extent. Simulation results show that RNSGA-II achieves higher convergence accuracy than single-population NSGA-II and that its solution set has better distribution and diversity.

Key words: double populations, migration, NSGA-II, path planning, reinforcement learning
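
The dual-population scheme described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the reinforcement learning method, the diversity measure, or the candidate migration ratios, so the epsilon-greedy value update, the mean-absolute-deviation diversity, the `RATIOS` list, and the migration `interval` below are all assumptions. The `evolve` function is a stand-in (Gaussian mutation plus truncation selection on a toy single objective), not NSGA-II.

```python
import random

RATIOS = [0.05, 0.10, 0.20]  # candidate migration ratios (assumed values)


def diversity(pop):
    """Mean absolute deviation from the mean (an assumed diversity measure)."""
    mean = sum(pop) / len(pop)
    return sum(abs(x - mean) for x in pop) / len(pop)


def evolve(pop, rng):
    """Stand-in for one NSGA-II generation: mutate, then keep the best half."""
    children = [x + rng.gauss(0.0, 0.1) for x in pop]
    merged = sorted(pop + children, key=abs)  # toy objective: minimise |x|
    return merged[:len(pop)]


def migrate(pop_a, pop_b, ratio, rng):
    """Swap a fraction `ratio` of randomly chosen individuals in place."""
    k = max(1, int(ratio * len(pop_a)))
    idx_a = rng.sample(range(len(pop_a)), k)
    idx_b = rng.sample(range(len(pop_b)), k)
    for i, j in zip(idx_a, idx_b):
        pop_a[i], pop_b[j] = pop_b[j], pop_a[i]


def run(generations=40, size=20, interval=2, eps=0.2, alpha=0.5, seed=0):
    """Evolve two populations; learn which migration ratio best helps diversity."""
    rng = random.Random(seed)
    pop_a = [rng.uniform(-5.0, 0.0) for _ in range(size)]
    pop_b = [rng.uniform(0.0, 5.0) for _ in range(size)]
    q = [0.0] * len(RATIOS)  # one value estimate per candidate ratio
    for gen in range(1, generations + 1):
        pop_a, pop_b = evolve(pop_a, rng), evolve(pop_b, rng)
        if gen % interval == 0:
            before = diversity(pop_a) + diversity(pop_b)
            # Epsilon-greedy choice of the migration ratio (assumed RL rule).
            if rng.random() < eps:
                action = rng.randrange(len(RATIOS))
            else:
                action = max(range(len(RATIOS)), key=q.__getitem__)
            migrate(pop_a, pop_b, RATIOS[action], rng)
            # Reward: change in summed diversity caused by the migration.
            reward = diversity(pop_a) + diversity(pop_b) - before
            q[action] += alpha * (reward - q[action])
    return pop_a, pop_b, q
```

Calling `run()` returns the two final populations and the learned value estimate for each candidate ratio. In a fuller version, `evolve` would be replaced by a real NSGA-II generation over path-planning objectives, and the diversity measure would be computed on the Pareto front.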