Computer Engineering and Applications, 2023, Vol. 59, Issue 21: 167-175. DOI: 10.3778/j.issn.1002-8331.2212-0361

• Graphics and Image Processing •

Improved YOLOX-S Real-Time Multi-Scale Traffic Sign Detection Algorithm

WANG Nengwen, ZHANG Tao   

  1. College of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214000, China
  • Online: 2023-11-01    Published: 2023-11-01

Abstract: Traffic sign detection is a challenging task for driverless systems. To address the small size of traffic signs and the interference of the background environment, an algorithm based on an improved YOLOX-S is proposed. ResNet50-vd-dcn is designed to replace the CSPDarknet53 backbone of the original YOLOX-S; combining ResNet-D with deformable convolution reduces the model's computational cost while preserving the network's learning ability. An enhanced feature map module is proposed, which uses a feature map connection flow and an attention mechanism flow to reduce information loss during feature map generation and thereby improve the model's representation ability. A three-channel weighted bidirectional feature pyramid network is proposed to replace the original feature pyramid structure, which strengthens feature fusion and improves multi-scale object recognition. To strengthen the model's learning from positive samples, the Focal Loss function is introduced in the post-processing stage. Experimental results show that, compared with the original YOLOX-S, small-target precision, small-target recall, and mAP on the TT100K dataset increase by 2.8, 4.1, and 2.1 percentage points, respectively, while the detection speed increases by 2.3 FPS. On the CCTSDB dataset, mAP increases by 1.1 percentage points and the detection speed is 120 FPS, which meets the requirements of real-time detection.
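
The abstract names Focal Loss without giving its form. As a minimal sketch, the standard binary formulation FL(p_t) = -α_t (1 - p_t)^γ log(p_t) can be written as below. The function names are hypothetical, and the values α = 0.25 and γ = 2 are the commonly used defaults, taken here as assumptions rather than as the settings used in this paper.

```python
import math


def binary_cross_entropy(p, y, eps=1e-7):
    """Plain cross-entropy for one prediction.

    p: predicted probability of the positive class, y: label (0 or 1).
    """
    p = min(max(p, eps), 1.0 - eps)      # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p       # probability assigned to the true class
    return -math.log(p_t)


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Standard binary focal loss for one prediction.

    alpha and gamma are assumed defaults, not values reported by the paper.
    """
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    modulating = (1.0 - p_t) ** gamma            # shrinks the loss of easy examples
    return -alpha_t * modulating * math.log(p_t)


if __name__ == "__main__":
    # An easy negative (background scored low) versus a hard positive (sign scored low).
    for name, p, y in [("easy negative", 0.05, 0), ("hard positive", 0.10, 1)]:
        ce = binary_cross_entropy(p, y)
        fl = binary_focal_loss(p, y)
        print(f"{name}: CE={ce:.4f}  FL={fl:.6f}  FL/CE={fl / ce:.4f}")
```

In this sketch the modulating factor (1 - p_t)^γ scales down the loss of well-classified examples far more than that of misclassified ones, so the many easy background locations contribute less to the total loss relative to the few hard positive samples.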

Key words: traffic sign detection, YOLOX-S, small target detection, feature enhancement, attention mechanism flow