Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (11): 144-155. DOI: 10.3778/j.issn.1002-8331.2409-0128

• Special Topic on Object Detection •

RO-YOLOv9 Vehicle and Pedestrian Detection Algorithm

LIAO Yanhua, WAN Xuejun, ZHAO Zhouzhou, PAN Wenlin   

  1. School of Electrical and Information Engineering, Yunnan Minzu University, Kunming 650500, China
  2. Science and Technology Information Brigade, Yuxi Public Security Bureau, Yuxi, Yunnan 653100, China
  3. School of Mathematics and Computer Science, Yunnan Minzu University, Kunming 650500, China
  • Online: 2025-06-01  Published: 2025-05-30

Abstract: Small or occluded vehicles and pedestrians in road traffic scenes lead to low detection accuracy, false detections, and missed detections. To address this, a road target detection algorithm, RO-YOLOv9, is proposed. First, a small target detection layer is added to strengthen the algorithm's feature learning for small targets. Second, a bidirectional and adaptive scale fusion feature pyramid network (BiASF-FPN) is designed to optimize multi-scale feature fusion, so that the algorithm captures detailed information on targets from small to large scales. Third, the OR-RepN4 module is proposed, which uses a re-parameterization strategy to simplify the complex algorithm structure and improve inference speed. Finally, the shape neighborhood weighted decomposition (Shape-NWD) loss function is adopted; it focuses on the shape and size of the bounding box and uses the normalized Gaussian Wasserstein distance to smooth the regression, achieving cross-scale invariance and reducing detection errors for small and occluded targets. Experimental results show that, on the optimized SODA10M and BDD100K datasets, RO-YOLOv9 reaches an mAP@0.5 (mean average precision at an IoU threshold of 0.5) of 68.1% and 56.8%, respectively, which is 5.6 and 4.4 percentage points higher than YOLOv9, with detection frame rates of 55.3 and 54.2 frames per second, respectively, achieving a balance between detection accuracy and detection speed.

Keywords: YOLOv9, small target detection, bidirectional and adaptive scale fusion feature pyramid network (BiASF-FPN), OR-RepN4, shape neighborhood weighted decomposition (Shape-NWD)
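
The abstract states that the loss uses a normalized Gaussian Wasserstein distance to smooth regression for small and occluded targets. As a rough illustration only, the sketch below computes that distance in its commonly used form: each box `(cx, cy, w, h)` is treated as a 2D Gaussian, and the Wasserstein distance between the two Gaussians is mapped into (0, 1] with an exponential. This is not the paper's Shape-NWD loss, whose shape-aware terms are not given in the abstract. The function names and the normalizing constant `c=12.8` are assumptions chosen for the example; in practice `c` is a dataset-dependent constant.

```python
import math


def iou_cxcywh(box_a, box_b):
    """Plain IoU for boxes given as (cx, cy, w, h); shown only for contrast."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(box_a, box_b, c=12.8):
    """Normalized Gaussian Wasserstein distance between two (cx, cy, w, h) boxes.

    Each box is modeled as a Gaussian with mean (cx, cy) and covariance
    diag(w^2 / 4, h^2 / 4). The constant c is an assumed placeholder and
    should be tuned to the typical object size of the dataset.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two Gaussians.
    w2_sq = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_sq) / c)


def nwd_loss(box_a, box_b, c=12.8):
    """Regression loss derived from NWD: 0 for identical boxes, approaching 1 far apart."""
    return 1.0 - nwd(box_a, box_b, c)


# Two tiny boxes that do not overlap: IoU gives no signal, NWD still does.
pred = (10.0, 10.0, 4.0, 4.0)
target = (15.0, 10.0, 4.0, 4.0)
print("IoU:", iou_cxcywh(pred, target))
print("NWD:", nwd(pred, target))
print("NWD loss:", nwd_loss(pred, target))
```

For small, non-overlapping boxes the IoU is zero regardless of how far apart they are, whereas the NWD value decays smoothly with distance. That smooth decay is the property that makes this kind of measure attractive for small-target regression.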