Computer Engineering and Applications, 2023, Vol. 59, Issue (16): 212-222. DOI: 10.3778/j.issn.1002-8331.2304-0251

• Graphics and Image Processing •

Improved YOLOv5 for Road Target Detection in Complex Environments

YUAN Lei, TANG Hai, CHEN Yanrong, GAO Ren, WU Wenhuan

  1. School of Electrical and Information Engineering, Hubei University of Automotive Technology, Shiyan, Hubei 442002, China
  • Online: 2023-08-15 Published: 2023-08-15

Abstract: Road object detection in complex environments suffers from missed detections caused by widely varying target scales, dense occlusion, and uneven illumination. To address this, an improved YOLOv5-based method, CTC-YOLO (contextual transformer and convolutional block attention module based on YOLOv5), is proposed. First, for small targets, the detection head structure is modified and a multi-scale detection layer is added to improve small-target detection accuracy. Second, to make full use of the contextual information in the input, the contextual transformer network (CoTNet) is introduced into the feature extraction part and a CoT3 module is designed, which guides the learning of the dynamic attention matrix and strengthens visual representation. Third, the convolutional block attention module (CBAM) is integrated into the C3 modules of the Neck to locate attention regions in a variety of complex scenes. To further validate CTC-YOLO, additional experiments examine where the modules are integrated in the model and compare against other attention mechanisms. Experimental results show that mAP@0.5 on the public datasets KITTI, Cityscapes, and BDD100K reaches 89.6%, 46.1%, and 57.0%, which is 3.1, 2.0, and 1.2 percentage points higher than the baseline model, respectively. Compared with other models, the method offers higher detection efficiency and effectively improves object detection in complex environments.

Key words: complex environment, target detection, YOLOv5, attention mechanism
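
The abstract reports the CTC-YOLO scores and the gains over the baseline, but not the baseline scores themselves. As a small worked example, the sketch below recovers the implied baseline mAP@0.5 on each dataset by subtracting the reported gain from the reported score. The variable and function names are illustrative assumptions, and the recovered values are only as precise as the rounded figures in the abstract.

```python
# Reported in the abstract: CTC-YOLO mAP@0.5 (%) and gain over the baseline
# in percentage points. Names here are hypothetical, chosen for illustration.
reported = {
    "KITTI": (89.6, 3.1),
    "Cityscapes": (46.1, 2.0),
    "BDD100K": (57.0, 1.2),
}


def implied_baselines(results):
    """Return the baseline mAP@0.5 implied by each (score, gain) pair."""
    # Round to one decimal to match the precision of the reported figures
    # and to avoid floating-point noise from the subtraction.
    return {name: round(score - gain, 1) for name, (score, gain) in results.items()}


baselines = implied_baselines(reported)

for name, base in baselines.items():
    score, gain = reported[name]
    print(f"{name}: baseline {base}% -> CTC-YOLO {score}% (+{gain} points)")
```

The gains are absolute percentage points, not relative percentages, so simple subtraction is the right operation here.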