Computer Engineering and Applications, 2025, Vol. 61, Issue (14): 135-147. DOI: 10.3778/j.issn.1002-8331.2412-0332

• Special Topic on Object Detection •

Small Target Detection Algorithm for UAV Based on Cross-Scale Feature Fusion

LUO Xianzhi, WANG Hang

  1. School of Artificial Intelligence, Hubei University, Wuhan 430062, China
  • Online: 2025-07-15 Published: 2025-07-15

Abstract: UAV small target detection is hampered by large scale variation, small target size, complex background interference, and frequent occlusion. To address these problems, an improved YOLOv11n model based on composite feature fusion and cross-scale optimization is proposed. In the backbone network, a multiscale perceptual cascade attention (MPCA) mechanism is proposed to improve the convolutional module, which compensates for the limited feature expression ability of traditional convolution and significantly improves the feature extraction ability of the network at a low computational cost. A new efficient multi-scale FPN (EMSFPN) structure is proposed to improve the neck network so that features from different levels are fused with one another. On top of the improved neck network, a feature layer rich in small-target semantic information is added. The selective boundary aggregation (SBA) module is used for interactive fusion of multi-resolution features, improving the multi-scale processing capability of the model. Drawing on the idea of the Inner-IoU loss function, the Wise-IoU function is improved, and the resulting Inner-WIoU replaces the original loss function, improving the localization accuracy for small targets and optimizing the loss calculation. On the VisDrone2019 dataset, the improved YOLOv11n algorithm has 9.8% fewer parameters than the original model and improves mAP50 by 9.1 percentage points, exceeding the performance of YOLOv11s. The model thus becomes more lightweight while its detection performance improves substantially.

Key words: YOLOv11n, unmanned aerial vehicle (UAV), small target detection, multi-scale feature fusion, Inner-WIoU
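
The abstract names the Inner-WIoU loss but does not give its formula. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes the Wise-IoU v1 form (the IoU loss weighted by a center-distance attention term) and the usual Inner-IoU combination rule (add the plain IoU and subtract the IoU of center-aligned auxiliary boxes scaled by a ratio). The function names, the `ratio` default of 0.75, and the `eps` guard are hypothetical choices made for illustration only.

```python
import math


def iou_xyxy(a, b):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def scale_box(box, ratio):
    """Auxiliary box: same center, width and height multiplied by ratio."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    half_w = (box[2] - box[0]) * ratio / 2
    half_h = (box[3] - box[1]) * ratio / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def inner_wiou_loss(pred, gt, ratio=0.75, eps=1e-9):
    """Assumed Inner-WIoU: L_WIoUv1 + IoU - IoU_inner (illustrative only)."""
    iou = iou_xyxy(pred, gt)
    inner_iou = iou_xyxy(scale_box(pred, ratio), scale_box(gt, ratio))

    # Width and height of the smallest box enclosing both pred and gt.
    enclose_w = max(pred[2], gt[2]) - min(pred[0], gt[0])
    enclose_h = max(pred[3], gt[3]) - min(pred[1], gt[1])

    # Offset between the two box centers.
    dx = (pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2

    # WIoU v1 distance attention; in a training framework the denominator
    # would be detached from the gradient graph.
    r_wiou = math.exp((dx * dx + dy * dy) / (enclose_w ** 2 + enclose_h ** 2 + eps))
    wiou = r_wiou * (1.0 - iou)

    return wiou + iou - inner_iou


if __name__ == "__main__":
    print(inner_wiou_loss((0, 0, 10, 10), (5, 0, 15, 10)))
```

With a ratio below 1 the auxiliary boxes shrink, so a partially overlapping prediction gets a lower inner IoU and a larger loss than under plain WIoU. With a ratio of 1 the sketch reduces to WIoU v1.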