Computer Engineering and Applications, 2023, Vol. 59, Issue 21: 141-150. DOI: 10.3778/j.issn.1002-8331.2207-0221

• Pattern Recognition and Artificial Intelligence •

Vehicle Detection Algorithm Combined with Cascading Attention Mechanism

DENG Tianmin, LIU Xuhui, WANG Li, WANG Chunxia   

  1. School of Traffic and Transportation, Chongqing Jiaotong University, Chongqing 400074, China
  • Online: 2023-11-01   Published: 2023-11-01

Abstract: Vehicle detection is hampered by complex backgrounds and by the difficulty of extracting features from small targets in distant scenes and from densely occluded targets. To address these problems, a vehicle detection algorithm, CAM-YOLO, that incorporates a cascading attention mechanism is proposed. First, a cascading attention feature extraction module is constructed that assigns different weights to feature information along the channel and spatial dimensions, strengthening the representation of key features while suppressing irrelevant background information. Second, a multi-scale feature detection method is adopted: a large-scale feature map containing more detail is constructed, which improves the detector's ability to extract features of small targets in distant scenes. Finally, DIOU_NMS post-processing is applied, which considers both the overlap area of the predicted boxes and the distance between their center points, so that predicted boxes are regressed accurately and the detection of densely occluded vehicles improves. Experimental results show that the algorithm reaches a mean average precision (mAP) of 98.13% on the KITTI dataset and 60.60% on the BDD100K dataset, with detection speeds of 76.92 FPS and 58.82 FPS, respectively, and that it outperforms the baseline algorithm YOLOv5s on vehicle detection under complex backgrounds, in distant scenes, and under dense occlusion.

Key words: vehicle detection, cascading attention mechanism, multi-scale feature detection, DIOU_NMS method
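
The abstract states that DIOU_NMS weighs both the overlap of predicted boxes and the distance between their center points, but gives no implementation details. The following is a minimal sketch of that idea, not the authors' code. It assumes the commonly used Distance-IoU definition (IoU minus the squared center distance divided by the squared diagonal of the smallest enclosing box), boxes in (x1, y1, x2, y2) form, and a suppression threshold of 0.5. The function names `diou` and `diou_nms` are hypothetical.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2)


def diou(a: Box, b: Box) -> float:
    """Distance-IoU: IoU minus the normalized squared center distance."""
    # Intersection and union areas
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers
    dx = (a[0] + a[2]) / 2 - (b[0] + b[2]) / 2
    dy = (a[1] + a[3]) / 2 - (b[1] + b[3]) / 2
    d2 = dx * dx + dy * dy

    # Squared diagonal of the smallest box enclosing both
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    c2 = cw * cw + ch * ch

    return iou - d2 / c2 if c2 > 0 else iou


def diou_nms(boxes: Sequence[Box], scores: Sequence[float],
             threshold: float = 0.5) -> List[int]:
    """Greedy NMS that suppresses a box only if its DIoU with a kept box
    exceeds the threshold (threshold value is an assumption)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)  # highest-scoring remaining box
        keep.append(best)
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(diou_nms(boxes, scores))  # [0, 2]
```

Because the center-distance penalty lowers the score of overlapping boxes whose centers are far apart, two adjacent vehicles that partially occlude each other are less likely to be merged into one detection than under plain IoU-based NMS.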