Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (24): 200-210. DOI: 10.3778/j.issn.1002-8331.2405-0019

• Graphics and Image Processing •

基于YOLOv8n的航拍图像小目标检测算法

齐向明,严萍萍,姜亮   

  1. 辽宁工程技术大学 软件学院,辽宁 葫芦岛 125100
  2. 塔里木大学 信息工程学院,新疆 阿拉尔 843300
  • 出版日期:2024-12-15 发布日期:2024-12-12

Small Target Detection Algorithm for Aerial Images Based on YOLOv8n

QI Xiangming, YAN Pingping, JIANG Liang   

  1. School of Software, Liaoning Technical University, Huludao, Liaoning 125100, China
  2. School of Information Engineering, Tarim University, Alaer, Xinjiang 843300, China
  • Online:2024-12-15 Published:2024-12-12

摘要: 针对航拍图像小目标检测中存在目标密集和相互遮挡问题,提出一种基于YOLOv8n的航拍图像小目标检测算法。在主干网络末段,置换C2f中Bottleneck为改进后的FasterNet,保持通道数并提升收敛速度;替换SPPF中 CBS激活函数SiLU为ReLU使输入负值置零,在CBS后引入SE注意力机制扩张感受野,保留更多小目标特征。输出端检测头前嵌入高效多尺度注意力机制EMA获取更多细节信息,进一步提高小目标关注度。将基线网络损失函数CIoU替换成Wise IoU,提供增益分配策略,专注普通质量锚框,提高网络泛化能力。在数据集VisDrone2021和RSOD上做消融实验和对比实验,相较于基线算法,mAP@0.5分别提升5.1和7.2个百分点,mAP@0.5:0.95分别提升4.4和2.1个百分点,表明检测精度指标显著提升;在公开数据集VOC2007+2012上做泛化实验,mAP@0.5提升3.8个百分点,表明具有良好的鲁棒性。

关键词: 航拍图像, 小目标检测, YOLOv8n, FasterNet, SPPF模块, 高效多尺度注意力机制(EMA), Wise IoU

Abstract: To address dense targets and mutual occlusion in small target detection for aerial images, this paper proposes a small target detection algorithm based on YOLOv8n. First, at the end of the backbone network, the Bottleneck in C2f is replaced with an improved FasterNet, which maintains the number of channels while improving convergence speed. Second, the SiLU activation function of the CBS in SPPF is replaced with ReLU, so that negative inputs are set to zero, and an SE attention mechanism is introduced after the CBS to enlarge the receptive field and retain more small target features. Third, the efficient multi-scale attention mechanism EMA is embedded in front of the detection head at the output end to capture more detailed information and further increase the attention paid to small targets. Finally, the baseline loss function CIoU is replaced with Wise IoU, whose gain allocation strategy focuses on ordinary-quality anchor boxes and improves network generalization. Ablation and comparison experiments are conducted on the VisDrone2021 and RSOD datasets. Compared with the baseline algorithm, mAP@0.5 increases by 5.1 and 7.2 percentage points, and mAP@0.5:0.95 by 4.4 and 2.1 percentage points, respectively, indicating a marked improvement in detection accuracy. Generalization experiments on the public dataset VOC2007+2012 show an improvement of 3.8 percentage points in mAP@0.5, indicating good robustness.
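
The gain allocation idea behind the Wise IoU loss can be illustrated with a minimal sketch. The abstract does not say which Wise IoU variant the authors use, so the code below assumes the commonly cited v1 distance-weighted loss together with a v3-style non-monotonic focusing gain. The function names and the values alpha=1.9 and delta=3.0 are illustrative assumptions, not taken from the paper, and the code works on plain Python floats rather than tensors.

```python
import math

def box_area(b):
    """Area of an (x1, y1, x2, y2) box."""
    return (b[2] - b[0]) * (b[3] - b[1])

def iou_loss(box, gt):
    """Plain IoU loss, 1 - IoU, for two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    ih = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = iw * ih
    union = box_area(box) + box_area(gt) - inter
    return 1.0 - inter / union

def wiou_v1(box, gt):
    """Assumed v1 form: IoU loss scaled by a centre-distance penalty.

    The penalty is exp(d^2 / (Wg^2 + Hg^2)), where d is the distance
    between box centres and Wg, Hg are the width and height of the
    smallest box enclosing both inputs.
    """
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    gx, gy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    wg = max(box[2], gt[2]) - min(box[0], gt[0])
    hg = max(box[3], gt[3]) - min(box[1], gt[1])
    penalty = math.exp(((cx - gx) ** 2 + (cy - gy) ** 2) / (wg ** 2 + hg ** 2))
    return penalty * iou_loss(box, gt)

def focusing_gain(l_iou, mean_l_iou, alpha=1.9, delta=3.0):
    """Assumed v3-style gain r = beta / (delta * alpha**(beta - delta)).

    beta = l_iou / mean_l_iou measures how much worse than average an
    anchor box is. The gain is small for very good boxes (beta near 0)
    and for very poor ones (large beta), and peaks in between.
    alpha and delta are illustrative hyperparameters.
    """
    beta = l_iou / mean_l_iou
    return beta / (delta * alpha ** (beta - delta))

def wiou_v3(box, gt, mean_l_iou):
    """Gain-weighted loss; mean_l_iou is supplied by the caller here."""
    return focusing_gain(iou_loss(box, gt), mean_l_iou) * wiou_v1(box, gt)
```

Under these assumptions the gain peaks at beta = 1/ln(alpha), roughly 1.56 for alpha = 1.9, so boxes of about average quality receive the largest weight, while both near-perfect and badly misplaced boxes are down-weighted. In an actual training loop the mean IoU loss would typically be a running average, and both it and the enclosing-box term would be excluded from gradient computation.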

Key words: aerial images, small object detection, YOLOv8n, FasterNet, spatial pyramid pooling fast (SPPF), efficient multi-scale attention (EMA), Wise IoU