Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (20): 328-338. DOI: 10.3778/j.issn.1002-8331.2306-0311

• Engineering and Applications •

RISE-D3QN驱动的多无人机数据采集路径规划

黄泽丰,李涛   

  1. 南京信息工程大学 自动化院,南京 210044
  • Publication date: 2024-10-15   Release date: 2024-10-15

RISE-D3QN-Based Path Planning for Multi-UAV Data Collection

HUANG Zefeng, LI Tao   

  1. School of Automation, Nanjing University of Information Science and Technology, Nanjing 210044, China
  • Online:2024-10-15 Published:2024-10-15

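Abstract: UAV-assisted Internet of Things (IoT) data collection is an efficient and promising approach. To address the resource-allocation optimization problem in path planning, this paper refines the energy consumption model and considers three metrics: collected data volume, time efficiency, and energy efficiency. The problem is modeled as a distributed partially observable Markov decision process, and a deep reinforcement learning algorithm is proposed. Specifically, the normalized model is split into four specific UAV energy consumption models. Building on an off-policy deep reinforcement learning architecture with discrete actions, a new RISE (Rényi state entropy)-D3QN (dueling double deep Q network) algorithm is proposed. It combines intrinsic rewards, prioritized experience replay, and a soft-max exploration strategy, and it can plan paths for a UAV swarm while the UAV battery capacity and the locations, data volumes, and number of IoT devices vary. Simulation results show that, compared with the conventional D3QN and DQN algorithms, the proposed algorithm increases the amount of data the UAVs collect from IoT devices while ensuring safe flight and, with data collection as the primary objective, reduces UAV flight time and energy consumption.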

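Keywords: unmanned aerial vehicle (UAV), path planning, deep reinforcement learning, multi-agent, Internet of Things (IoT), data collection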

Abstract: Unmanned aerial vehicle (UAV)-assisted Internet of Things (IoT) data collection is an efficient and promising approach. This paper addresses the optimization of resource allocation in path planning by refining the energy consumption model and considering three metrics: the amount of collected data, time efficiency, and energy efficiency. The problem is formulated as a distributed partially observable Markov decision process (POMDP), and a novel deep reinforcement learning algorithm called RISE (Rényi state entropy)-D3QN (dueling double deep Q network) is proposed. It combines intrinsic rewards, prioritized experience replay, and a soft-max exploration strategy, enabling path planning for UAV swarms while adapting to changes in UAV battery capacity and in the locations, data volumes, and number of IoT devices. Simulation results demonstrate that, compared with the traditional D3QN and DQN algorithms, the proposed approach significantly increases the amount of data collected from IoT devices while reducing UAV flight time and energy consumption, all while ensuring UAV safety during flight.

Key words: unmanned aerial vehicles (UAVs), path planning, deep reinforcement learning, multi-agent, Internet of things (IoT), data collection
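
Two of the building blocks named in the abstract, the dueling Q-value decomposition and soft-max exploration, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `temperature` parameter, and the example numbers are assumptions made here for illustration, and the abstract does not specify the network architecture or the exploration schedule.

```python
import math
import random


def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) using the common
    mean-subtracted dueling aggregation (assumed form)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def softmax_probabilities(q_values, temperature=1.0):
    """Boltzmann (soft-max) distribution over discrete actions.
    `temperature` is a hypothetical knob: higher means more exploration."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    max_q = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - max_q) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(q_values, temperature=1.0, rng=random):
    """Draw one action index with probability given by the soft-max."""
    probs = softmax_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


# Example with made-up numbers: three discrete UAV actions.
q = dueling_q_values(1.0, [0.5, -0.5, 0.0])
action = sample_action(q, temperature=0.5, rng=random.Random(0))
```

Unlike an epsilon-greedy rule, soft-max exploration weights non-greedy actions by their estimated value, so clearly poor actions are tried less often than near-optimal ones.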