Algorithm of Reconstructed SPPCSPC and Optimized Downsampling for Small Object Detection

doi:10.3778/j.issn.1002-8331.2305-0004

Abstract

Abstract: To address mutual occlusion, complex backgrounds, and scarce feature points in small object detection, a small object detection algorithm based on YOLOv7 is proposed that reconstructs the SPPCSPC module and optimizes downsampling. In the SPPCSPC module of the backbone network, CBS layers are pruned, the SimAM attention mechanism is introduced, and the pooling kernels are reduced in size, so that the network attends more closely to dense target regions and extracts more features from mutually occluded small objects. In the neck network, the SConv in the downsampling structure is replaced with SPD Conv and a fourfold downsampling branch is added, which reduces the loss of small object features and captures more of them against complex backgrounds. The loss function of the network is changed from CIoU to Wise IoU, which focuses on anchor boxes of ordinary quality and speeds up convergence. Comparative and ablation experiments are conducted on the public dataset VisDrone2021: compared with the original YOLOv7, the algorithm raises mAP by 5.09 percentage points, reaches 40 FPS, and reduces the parameter size by 2.5 MB, showing that small object detection accuracy improves markedly while inference speed is maintained and the parameter count falls. A generalization experiment on the public dataset VOC2007+2012 raises mAP by 3.35 percentage points, indicating that the algorithm generalizes to other data.
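The space-to-depth step that lets SPD Conv downsample without discarding pixels can be illustrated on a single feature-map channel. The following is a minimal sketch in plain Python, not the authors' implementation: the function name `space_to_depth`, the nested-list representation, and the ordering of the output sub-maps are assumptions made for illustration, and the non-strided convolution that would normally follow is omitted.

```python
def space_to_depth(channel, scale=2):
    """Split one H x W channel into scale*scale sub-maps of size (H/scale) x (W/scale).

    Every input value lands in exactly one sub-map, so resolution is halved
    (for scale=2) without dropping any pixel, unlike a stride-2 convolution.
    The sub-map ordering used here is an arbitrary illustrative choice.
    """
    height, width = len(channel), len(channel[0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    return [
        [row[dx::scale] for row in channel[dy::scale]]
        for dy in range(scale)
        for dx in range(scale)
    ]


# Hypothetical 4 x 4 feature map holding the values 0..15.
feature_map = [[4 * r + c for c in range(4)] for r in range(4)]
sub_maps = space_to_depth(feature_map)
```

In this sketch a 4 x 4 channel becomes four 2 x 2 channels; in a real network the sub-maps would be concatenated along the channel axis and passed to a stride-1 convolution.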

Key words: small object detection, reconstructed SPPCSPC, optimized downsampling, Wise IoU, YOLOv7
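The idea of a loss that focuses on ordinary-quality anchor boxes can be sketched as follows. This is an illustration based on the distance-weighted IoU loss and the non-monotonic focusing gain described for Wise-IoU, not the authors' training code: the abstract does not state which Wise IoU variant or hyperparameters were used, so the function names and the values `alpha=1.9` and `delta=3.0` are assumptions.

```python
import math


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def wiou_v1(pred, gt):
    """IoU loss scaled by a center-distance factor (assumes non-degenerate boxes)."""
    dx = (pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2
    # Width and height of the smallest box enclosing both boxes.
    enc_w = max(pred[2], gt[2]) - min(pred[0], gt[0])
    enc_h = max(pred[3], gt[3]) - min(pred[1], gt[1])
    # In a deep learning framework this factor would be detached from the
    # gradient graph; plain floats carry no gradients, so nothing to detach.
    distance_factor = math.exp((dx * dx + dy * dy) / (enc_w ** 2 + enc_h ** 2))
    return distance_factor * (1.0 - iou(pred, gt))


def focusing_gain(beta, alpha=1.9, delta=3.0):
    """Non-monotonic gain; beta is a box's IoU loss divided by the running mean loss."""
    return beta / (delta * alpha ** (beta - delta))
```

With these assumed hyperparameters the gain is small for very low `beta` (high-quality boxes) and for very high `beta` (outliers), and largest in between, which is one way to read "focusing on anchor boxes of ordinary quality".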

QI Xiangming, CHAI Rui, GAO Yimeng. Algorithm of Reconstructed SPPCSPC and Optimized Downsampling for Small Object Detection[J]. Computer Engineering and Applications, 2023, 59(20): 158-166.
