Improved Road Defect Detection Algorithm Based on YOLOv8

doi:10.3778/j.issn.1002-8331.2404-0288

Abstract

Abstract: Road surfaces develop various defects after prolonged use, and failing to detect and repair them promptly can significantly shorten a road's service life and endanger driving safety. Timely detection of road defects is therefore an important task. Traditional detection methods are slow and costly. To address these problems, a new road defect detection algorithm named DML-YOLO is proposed on the basis of YOLOv8. The algorithm adds a MultiPath coordinate attention (MPCA) mechanism to the backbone network to strengthen feature extraction, and on this basis introduces the C2f-MPDC module, which dynamically adjusts the receptive field to improve detection capability. The neck of the network is redesigned around a new feature fusion pyramid structure, the diversity feature pyramid network (DFPN), which reduces model size and fuses low-level feature maps to obtain rich detail information, raising the success rate on small targets. A lightweight shared convolutional detection head (LSCD head) is designed to reduce model size and improve detection efficiency. Extensive experiments show that DML-YOLO achieves mAP@0.5 of 89.6% on the RDD2022 dataset and 73.6% on the VOC2007 dataset, outperforming the other models tested, while reducing the parameter count and computational cost by 32.37% and 14.49%, respectively, relative to YOLOv8. This makes it better suited to deployment in resource-constrained and edge computing scenarios such as embedded systems and mobile devices.

Key words: MultiPath coordinate attention, road detection, YOLOv8, shared convolution
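
The abstract attributes part of the model-size reduction to a detection head that shares convolutions. As a rough illustration of why sharing helps, the sketch below counts the parameters of a stack of 3x3 convolutions when each feature scale owns its own copy versus when one stack is reused across all scales. The channel width, number of scales, and number of layers are hypothetical values chosen for the example; they are not the LSCD head's actual configuration, and the helper names are introduced here only for illustration.

```python
# Hypothetical illustration: channel width, kernel size, layer count, and
# number of scales are assumptions, not values taken from DML-YOLO.

def conv_params(c_in: int, c_out: int, k: int, bias: bool = True) -> int:
    """Number of learnable parameters in one k x k convolution layer."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def head_params(channels: int, num_scales: int, num_convs: int, shared: bool) -> int:
    """Parameters of a stack of 3x3 convolutions applied to every feature scale.

    If `shared` is True, one stack is reused for all scales; otherwise each
    scale owns a separate copy of the stack.
    """
    one_stack = num_convs * conv_params(channels, channels, k=3)
    return one_stack if shared else num_scales * one_stack


def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage reduction of `improved` relative to `baseline`."""
    return 100.0 * (baseline - improved) / baseline


separate = head_params(channels=64, num_scales=3, num_convs=2, shared=False)
shared = head_params(channels=64, num_scales=3, num_convs=2, shared=True)
print(separate, shared, round(relative_reduction(separate, shared), 2))
```

Under these assumptions the shared stack needs one third of the parameters of the per-scale stacks. In a real detector the saving is smaller, because the feature scales usually have different channel widths and must first be projected to a common width, and the head is only one part of the whole network. The same `relative_reduction` formula is the kind of calculation behind whole-model figures such as the 32.37% and 14.49% reported in the abstract.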

WANG Xueqiu, GAO Huanbing, JIA Zemeng. Improved Road Defect Detection Algorithm Based on YOLOv8[J]. Computer Engineering and Applications, 2024, 60(17): 179-190.
