Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (16): 186-197. DOI: 10.3778/j.issn.1002-8331.2403-0383

• Graphics and Image Processing •

Improved Road Object Detection Algorithm for YOLOv8n

GAO Deyong, CHEN Taida, MIAO Lan

  1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
  2. Gansu Provincial Engineering Research Center for Artificial Intelligence and Graphic and Image Processing, Lanzhou 730070, China
  • Online: 2024-08-15 Published: 2024-08-15

Abstract: To address the low detection accuracy and high missed-detection rate caused by varying object scales and complex background interference in road scenes, an improved road object detection algorithm based on YOLOv8n is proposed. First, the diverse branch block (DBB) is introduced to construct the C2fDBB module, which replaces the C2f module of the original algorithm and strengthens the network's multi-scale feature extraction. Second, building on the path aggregation network (PANet) and drawing on the idea of the asymptotic feature pyramid network (AFPN), a path aggregation progressive feature pyramid network (PA-AFPN) feature fusion method is proposed to improve the network's ability to fuse multi-scale features. Third, the SPPF2_TA module (SPPF with dual-branch structure incorporating triplet attention) is designed: an average pooling branch and a triplet attention (TA) mechanism are introduced into SPPF (spatial pyramid pooling fast) to integrate multi-scale information and reduce the impact of background interference on detection. Finally, MPDIoU is adopted as the bounding-box regression loss function in place of the original loss, accelerating convergence and improving localization accuracy. Experimental results on the public road object datasets BDD100K and SODA10M show that the improved method raises mAP@0.5 by 5.7 and 7.3 percentage points, respectively, over the baseline algorithm, while reducing the computational cost by 0.6 GFLOPs. Compared with other mainstream object detection methods, the improved method shows clear advantages in computational cost, FPS, and mAP@0.5, making it better suited to object detection in road scenes.

Key words: YOLOv8, structural reparameterization, asymptotic feature pyramid network (AFPN), road object, attention mechanism
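
The abstract names MPDIoU as the bounding-box regression loss but does not state its formula. As a minimal sketch, assuming the commonly cited definition (IoU minus the squared distances between the two boxes' top-left corners and between their bottom-right corners, each normalized by the squared diagonal of the input image), the loss for a single box pair could be computed as follows. The function names and the `(x1, y1, x2, y2)` box layout are illustrative assumptions, not the authors' implementation.

```python
def mpdiou(pred, target, img_w, img_h):
    """MPDIoU for two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed definition: IoU - d1^2 / (w^2 + h^2) - d2^2 / (w^2 + h^2),
    where d1 and d2 are the distances between the top-left and the
    bottom-right corners, and w, h are the input image width and height.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_p = (px2 - px1) * (py2 - py1)
    area_t = (tx2 - tx1) * (ty2 - ty1)
    union = area_p + area_t - inter
    iou = inter / union if union > 0 else 0.0

    # Squared corner distances, normalized by the squared image diagonal.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2
    norm = img_w ** 2 + img_h ** 2

    return iou - d1_sq / norm - d2_sq / norm


def mpdiou_loss(pred, target, img_w, img_h):
    """Regression loss: 1 - MPDIoU (0 for a perfect match)."""
    return 1.0 - mpdiou(pred, target, img_w, img_h)
```

Under this definition the corner-distance terms stay informative even when the boxes do not overlap and IoU is zero, which is one plausible reason such a loss can speed up convergence.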