Improved YOLOv5 Small Object Detection Algorithm in Moving Scenes

doi:10.3778/j.issn.1002-8331.2211-0017

Abstract

Abstract: In motion scenes, device movement and camera defocus blur the captured images and lower their quality, and the objects of interest are small; together these factors make object detection difficult. To address this, an improved real-time object detection model based on YOLOv5x is proposed. First, deformable convolutional networks replace some of the conventional convolution layers in the original YOLOv5x, strengthening fine-grained feature extraction and small object detection in motion scenes. Second, an SE (squeeze-and-excitation) attention mechanism is added to counter the feature loss caused by losing global image context during convolution, which improves small object detection accuracy on blurred images. Finally, a new bounding box regression loss function, SIoU Loss, is introduced to address the arbitrary matching of predicted boxes during regression, improving the robustness and generalization of the model and accelerating network convergence. Experimental results show that, when the improved algorithm is applied to organism detection with an underwater mobile robot, precision (P), recall (R), and mean average precision (mAP) increase by 5.90, 5.85, and 4.38 percentage points, respectively, over the YOLOv5x baseline, effectively improving small object detection performance.

Key words: deformable convolutional network, attention mechanism, SIoU Loss, YOLOv5x
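
To make the bounding box regression term in the abstract more concrete, the following is a minimal sketch of an SIoU-style loss for a single pair of axis-aligned boxes. It follows the formulation published for SIoU Loss (angle, distance, and shape costs added to 1 - IoU) as commonly stated, and is not taken from the authors' implementation. The function name `siou_loss`, the `(x1, y1, x2, y2)` box layout, and the default `theta=4.0` are assumptions made for illustration.

```python
import math


def siou_loss(box, gt, theta=4.0):
    """Illustrative SIoU-style loss for one predicted box and one ground-truth box.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed layout).
    theta controls how strongly shape mismatch is penalized (assumed default).
    """
    # Widths, heights, and centers of both boxes.
    w, h = box[2] - box[0], box[3] - box[1]
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    gcx, gcy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2

    # Plain IoU: intersection area over union area.
    iw = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    ih = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = iw * ih
    union = w * h + gw * gh - inter
    iou = inter / union

    # Width and height of the smallest box enclosing both boxes.
    cw = max(box[2], gt[2]) - min(box[0], gt[0])
    ch = max(box[3], gt[3]) - min(box[1], gt[1])

    # Angle cost: 0 when the centers are aligned on an axis, 1 at 45 degrees.
    dx, dy = gcx - cx, gcy - cy
    sigma = math.hypot(dx, dy)
    sin_alpha = min(1.0, abs(dy) / sigma) if sigma > 0 else 0.0
    angle_cost = 1 - 2 * math.sin(math.asin(sin_alpha) - math.pi / 4) ** 2

    # Distance cost: normalized center offsets, weighted by the angle cost.
    gamma = 2 - angle_cost
    rho_x = (dx / cw) ** 2
    rho_y = (dy / ch) ** 2
    dist_cost = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

    # Shape cost: relative mismatch in width and height.
    omega_w = abs(w - gw) / max(w, gw)
    omega_h = abs(h - gh) / max(h, gh)
    shape_cost = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1 - iou + (dist_cost + shape_cost) / 2
```

In this sketch, identical boxes give a loss of approximately zero, and the loss grows as the predicted box shifts away from or changes shape relative to the ground truth. Unlike a plain IoU loss, the distance term still varies for boxes that do not overlap at all, which is one commonly cited motivation for this family of losses.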

ZHU Ruixin, YANG Fuxing. Improved YOLOv5 Small Object Detection Algorithm in Moving Scenes[J]. Computer Engineering and Applications, 2023, 59(10): 196-203.
