Improved Lightweight Target Detection Algorithm for YOLOv4-tiny

doi:10.3778/j.issn.1002-8331.2207-0496

Abstract

Abstract: To address slow feature extraction, insufficient real-time performance, and poor algorithm portability in object detection deployed on embedded devices, a lightweight network, YOLOv4-tiny-CSPRDWConv, is proposed. It takes YOLOv4-tiny as the benchmark network and is based on the CSPRDWConv module and improved Mosaic data augmentation. The improved Mosaic method saves time in the data augmentation process, makes full use of each image block, and filters out objects that are too small, making the model easier to train. In addition, all convolutional layers of the backbone network use small convolution kernels, and a 5×5 depthwise separable convolution is used only in the last compression of the feature map, to ensure low latency and high accuracy; the BN layers are then fused to speed up inference. The improved YOLOv4-tiny algorithm achieves a real-time detection speed of 1 308 FPS on 1080Ti hardware and an inference speed of about 8 FPS on the RK3288 development board, about four times as fast as the YOLOv4-tiny benchmark network; its mAP reaches 22.31%, an improvement of 0.61 percentage points over the benchmark network. Experimental results show that the improved YOLOv4-tiny algorithm provides smoother and more efficient detection on embedded devices.

Key words: object detection, YOLOv4-tiny, embedded systems, CSPRDWConv module, Mosaic data augmentation
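
The abstract states that BN layers are fused to speed up inference. As a minimal sketch of what such a fusion usually involves (this is the standard conv-BN folding identity, not necessarily the authors' implementation), the NumPy example below folds the inference-time BN statistics into the weights and bias of the preceding convolution and checks that the fused layer reproduces the two-step result. All function names, tensor shapes, and the `eps` value are assumptions chosen for illustration.

```python
import numpy as np


def conv2d(x, weight, bias):
    """Naive stride-1, no-padding convolution. x has shape (in_ch, H, W)."""
    out_ch, _, kh, kw = weight.shape
    h_out = x.shape[1] - kh + 1
    w_out = x.shape[2] - kw + 1
    out = np.empty((out_ch, h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[:, i:i + kh, j:j + kw]
            # Contract (in_ch, kh, kw) of every filter with the patch.
            out[:, i, j] = np.tensordot(weight, patch, axes=3) + bias
    return out


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch normalization applied per output channel."""
    scale = gamma / np.sqrt(var + eps)
    return (y - mean[:, None, None]) * scale[:, None, None] + beta[:, None, None]


def fuse_conv_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold BN statistics into the convolution weights and bias."""
    scale = gamma / np.sqrt(var + eps)                  # one factor per output channel
    fused_weight = weight * scale[:, None, None, None]  # rescale each filter
    fused_bias = (bias - mean) * scale + beta
    return fused_weight, fused_bias


# Hypothetical layer sizes and random parameters, for illustration only.
rng = np.random.default_rng(0)
out_ch, in_ch, k = 4, 3, 3
x = rng.normal(size=(in_ch, 8, 8))
weight = rng.normal(size=(out_ch, in_ch, k, k))
bias = rng.normal(size=out_ch)
gamma = rng.normal(size=out_ch)
beta = rng.normal(size=out_ch)
mean = rng.normal(size=out_ch)
var = rng.uniform(0.5, 2.0, size=out_ch)

reference = batch_norm(conv2d(x, weight, bias), gamma, beta, mean, var)
fused_weight, fused_bias = fuse_conv_bn(weight, bias, gamma, beta, mean, var)
fused = conv2d(x, fused_weight, fused_bias)

print("max abs difference:", np.abs(reference - fused).max())
```

Because the fused layer needs a single convolution per forward pass instead of a convolution followed by a normalization, the BN cost disappears at inference time while the outputs stay numerically equivalent up to floating-point rounding.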

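Abstract (translated from the Chinese): To address slow feature extraction, insufficient real-time detection performance, and poor algorithm portability in object detection deployed on embedded devices, a lightweight network, YOLOv4-tiny-CSPRDWConv, is proposed. It takes YOLOv4-tiny as the benchmark network, is based on the CSPRDWConv (cross stage partial residual depthwise convolution) module, and uses improved Mosaic data augmentation to raise the accuracy of the detection model. The CSPRDWConv module moderately reduces the computational scale, so that the module as a whole substantially increases inference speed while maintaining accuracy. The improved Mosaic data augmentation method saves time in the augmentation process, makes full use of each image block, and filters out objects that are too small, making the model easier to train. On this basis, all convolutional layers of the backbone network use small convolution kernels, and a 5×5 depthwise separable convolution is used only in the last compression of the feature map, to ensure low latency and high accuracy. A weak SPP module is introduced in the Neck, using local and global features to improve detection accuracy. The trained detection model is optimized with NEON instructions, and the convolutional layers are fused with the BN layers to speed up inference. The improved YOLOv4-tiny algorithm achieves a real-time detection speed of 1 308 FPS on 1080Ti hardware and an inference speed of about 8 FPS on the RK3288 development board, about four times the detection speed of the YOLOv4-tiny benchmark network; its mAP reaches 22.31%, an improvement of 0.61 percentage points over the benchmark network. Experimental results show that the improved YOLOv4-tiny algorithm provides smoother and more efficient detection on embedded devices.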

Key words (translated from the Chinese): object detection, YOLOv4-tiny, embedded systems, CSPRDWConv module, Mosaic data augmentation
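
The abstract mentions a 5×5 depthwise separable convolution chosen for low latency. To illustrate why such a layer is cheap, the short Python sketch below compares parameter and multiply-accumulate (MAC) counts of a standard convolution with those of a depthwise separable one (a depthwise k×k convolution followed by a 1×1 pointwise convolution). The channel counts and the 13×13 output size are hypothetical values chosen for illustration and are not taken from the paper; biases are ignored.

```python
def standard_conv_cost(c_in, c_out, k, h_out, w_out):
    """Parameters and MACs of a standard k x k convolution (bias ignored)."""
    params = k * k * c_in * c_out
    macs = params * h_out * w_out
    return params, macs


def depthwise_separable_cost(c_in, c_out, k, h_out, w_out):
    """Parameters and MACs of a depthwise k x k plus pointwise 1 x 1 convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    params = depthwise + pointwise
    macs = params * h_out * w_out
    return params, macs


# Hypothetical layer shape: 256 -> 512 channels, 5x5 kernel, 13x13 output map.
c_in, c_out, k, h_out, w_out = 256, 512, 5, 13, 13

std_params, std_macs = standard_conv_cost(c_in, c_out, k, h_out, w_out)
dws_params, dws_macs = depthwise_separable_cost(c_in, c_out, k, h_out, w_out)

print("standard conv params:       ", std_params)
print("depthwise separable params: ", dws_params)
print("parameter reduction factor:  %.1f" % (std_params / dws_params))
```

The reduction factor equals 1 / (1/c_out + 1/k²), so the saving grows with the kernel size, which is consistent with reserving the larger 5×5 kernel for a depthwise separable layer.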

GUO Mingzhen, WANG Wei, SHEN Hongting, HOU Hongtao, LIU Kuan, LUO Zijiang. Improved Lightweight Target Detection Algorithm for YOLOv4-tiny[J]. Computer Engineering and Applications, 2023, 59(23): 145-153.

郭明镇, 汪威, 申红婷, 候红涛, 刘宽, 罗子江. 改进型YOLOv4-tiny的轻量级目标检测算法[J]. 计算机工程与应用, 2023, 59(23): 145-153.
