Improved YOLOv8 Small Target Detection Algorithm in Aerial Images

doi:10.3778/j.issn.1002-8331.2311-0281

Abstract

In aerial image detection tasks, both the objects and the overall image size are small, scale characteristics vary, and detail information is unclear, which leads to missed and false detections. To address these problems, an improved small target detection algorithm, CA-YOLOv8, is proposed. A channel feature partial convolution (CFPConv) module is designed, and the Bottleneck structure in C2f is rebuilt on it and named CFP_C2f. CFP_C2f replaces some of the C2f modules in the YOLOv8 head and neck, which strengthens the weights of effective channel features and improves the ability to capture multi-scale detail features. A context aggregated module (CAM) is embedded to improve context aggregation, optimize the response of feature channels, and strengthen the perception of detail in deep features. The NWD loss function is added and combined with CIoU as the localization regression loss to reduce sensitivity to position deviation. To make full use of multiple attention mechanisms, the original detection head is replaced with DyHead (dynamic head). In experiments on the VisDrone2019 dataset, the improved algorithm reduces the number of parameters by 33.3% compared with the original YOLOv8s model, while mAP50 and mAP50:95 increase by 8.7 and 5.7 percentage points respectively, which confirms its effectiveness.

Key words: small target detection, YOLOv8 algorithm, feature channel fusion, multiple attention
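
The combined localization loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulation, so the following are assumptions: NWD follows the normalized Gaussian Wasserstein distance as commonly defined for boxes in (cx, cy, w, h) form, the constant `c` is a dataset-dependent placeholder, and the mixing weight `ratio` and the function names are hypothetical rather than taken from the paper.

```python
import math

def nwd(box_a, box_b, c=12.8):
    """Normalized Gaussian Wasserstein distance between (cx, cy, w, h) boxes.

    Each box is modeled as a 2D Gaussian with mean (cx, cy) and
    covariance diag(w^2/4, h^2/4). `c` is an assumed placeholder constant.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    w2_sq = ((cxa - cxb) ** 2 + (cya - cyb) ** 2
             + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2)
    return math.exp(-math.sqrt(w2_sq) / c)

def ciou(box_a, box_b, eps=1e-9):
    """Complete IoU between (cx, cy, w, h) boxes."""
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    ax1, ay1, ax2, ay2 = cxa - wa / 2, cya - ha / 2, cxa + wa / 2, cya + ha / 2
    bx1, by1, bx2, by2 = cxb - wb / 2, cyb - hb / 2, cxb + wb / 2, cyb + hb / 2
    # Plain IoU from the intersection rectangle.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    iou = inter / (wa * ha + wb * hb - inter + eps)
    # Center distance normalized by the enclosing box diagonal.
    diag_sq = ((max(ax2, bx2) - min(ax1, bx1)) ** 2
               + (max(ay2, by2) - min(ay1, by1)) ** 2 + eps)
    rho_sq = (cxa - cxb) ** 2 + (cya - cyb) ** 2
    # Aspect-ratio consistency term.
    v = (4 / math.pi ** 2) * (math.atan(wb / (hb + eps))
                              - math.atan(wa / (ha + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    return iou - rho_sq / diag_sq - alpha * v

def box_loss(pred, target, ratio=0.5, c=12.8):
    """Weighted sum of the CIoU loss and the NWD loss (assumed weighting)."""
    return ratio * (1 - ciou(pred, target)) + (1 - ratio) * (1 - nwd(pred, target, c))

if __name__ == "__main__":
    gt = (10.0, 10.0, 4.0, 4.0)
    near = (16.0, 10.0, 4.0, 4.0)   # no overlap with gt, 6 px away
    far = (20.0, 10.0, 4.0, 4.0)    # no overlap with gt, 10 px away
    print(nwd(gt, near), nwd(gt, far))
    print(box_loss(near, gt), box_loss(far, gt))
```

For two tiny boxes that do not overlap, plain IoU is zero at any distance, whereas NWD still decreases smoothly as the boxes move apart. This is one way to read the abstract's claim that adding NWD reduces sensitivity to position deviation.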

FU Jinyi, ZHANG Zijia, SUN Wei, ZOU Kaixin. Improved YOLOv8 Small Target Detection Algorithm in Aerial Images[J]. Computer Engineering and Applications, 2024, 60(6): 100-109.
