Research on Optimization of YOLOv5 Detection Algorithm for Object in Complex Road

doi:10.3778/j.issn.1002-8331.2305-0013

Abstract

Existing object detection algorithms suffer from false detections and a high missed-detection rate for small and occluded targets in complex road scenes. To address these problems, DPE-YOLO, a complex road object detection model based on an improved YOLOv5, is proposed. For the preset anchor boxes, a K-means+D clustering algorithm based on sample density is proposed to generate more effective anchors, which shortens the convergence path and improves detection accuracy. For feature extraction, a PAA module is designed to replace the C3 module in the original backbone network; it adopts an attention-based multi-gradient-flow residual structure, which strengthens the extraction of detailed information and reduces missed and false detections of small road targets. For localization accuracy, the EIOU loss is introduced to reduce the missed-detection rate for occluded targets. Experiments on the KITTI and Udacity datasets show that, compared with the original algorithm, the improved algorithm raises the mean average precision (mAP) by 2.8 and 1.6 percentage points, and mAP@0.5:0.9 by 2.7 and 2.9 percentage points, respectively. The results indicate that DPE-YOLO effectively improves the detection of small and occluded targets in complex road scenes and better meets the detection requirements of autonomous driving.

Key words: autonomous driving, clustering algorithm, multi-gradient flow, attention mechanism, loss function
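
The EIOU loss named in the abstract can be illustrated with a minimal sketch. It assumes the commonly published formulation: the IoU loss plus three penalty terms for centre distance, width difference, and height difference, each normalised by the smallest box enclosing the prediction and the ground truth. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabiliser are illustrative assumptions and do not reproduce the authors' implementation.

```python
def eiou_loss(box, gt, eps=1e-9):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative sketch only; names and box format are assumptions.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection-over-union term.
    iw = max(0.0, min(x2, gx2) - max(x1, gx1))
    ih = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = iw * ih
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)

    # Squared centre distance, normalised by the squared enclosing diagonal.
    dx = (x1 + x2) / 2 - (gx1 + gx2) / 2
    dy = (y1 + y2) / 2 - (gy1 + gy2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalised by the enclosing box sides.
    width_term = (w - gw) ** 2 / (cw * cw + eps)
    height_term = (h - gh) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term


if __name__ == "__main__":
    # A perfectly aligned prediction gives a loss close to zero.
    print(eiou_loss((0, 0, 10, 10), (0, 0, 10, 10)))
    # A disjoint prediction still receives a distance-dependent penalty.
    print(eiou_loss((0, 0, 1, 1), (2, 2, 3, 3)))
```

Unlike a plain IoU loss, the centre-distance term stays informative when the boxes do not overlap, and the width and height terms penalise shape mismatch directly. This is one plausible reason such a loss can help with partially occluded targets.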

LIU Hui, LIU Xinman, LIU Dadong. Research on Optimization of YOLOv5 Detection Algorithm for Object in Complex Road[J]. Computer Engineering and Applications, 2023, 59(18): 207-217.
