Improved YOLOX-S Real-Time Multi-Scale Traffic Sign Detection Algorithm

doi:10.3778/j.issn.1002-8331.2212-0361

Abstract

Abstract: Traffic sign detection is a challenging task for driverless systems because the targets are small and easily affected by the background environment. To address these difficulties, an algorithm based on an improved YOLOX-S is proposed. First, ResNet50-vd-dcn is designed to replace the CSPDarknet53 backbone of the original YOLOX-S; combining ResNet-D with deformable convolution reduces the model's computational cost while preserving the network's learning ability. Second, an enhanced feature map module is proposed, which uses a feature map connection flow and an attention mechanism flow to reduce information loss during feature map generation and improve the model's representation ability. Third, a three-channel weighted bidirectional feature pyramid network is proposed to replace the original feature pyramid structure, strengthening feature fusion and improving multi-scale object recognition. In addition, to increase the model's learning from positive samples, the focal loss function is introduced in the post-processing stage. Experimental results show that, compared with the original YOLOX-S algorithm, small-target precision, small-target recall, and mAP on the TT100K dataset increase by 2.8, 4.1, and 2.1 percentage points, respectively, and detection is 2.3 FPS faster. On the CCTSDB dataset, mAP rises by 1.1 percentage points and the detection speed is 120 FPS, meeting the requirements for real-time detection.

Key words: traffic sign detection, YOLOX-S, small target detection, feature enhancement, attention mechanism flow
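
Two of the components named in the abstract, focal loss and weighted feature fusion, can be illustrated with a minimal sketch. This is not the authors' implementation. The focal loss below is the standard binary form from Lin et al., with the commonly used defaults alpha = 0.25 and gamma = 2, which the abstract does not state. The fusion function follows the fast normalized fusion rule of EfficientDet's BiFPN; how the paper's three-channel variant wires its inputs is not described in the abstract, so the function simply accepts any number of same-sized inputs. The names `binary_focal_loss` and `fast_normalized_fusion` are hypothetical.

```python
import math


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Focal loss for one predicted probability p and a label y in {0, 1}.

    alpha and gamma are assumed defaults, not values taken from the paper.
    """
    p = min(max(p, eps), 1.0 - eps)            # clamp to keep log() finite
    p_t = p if y == 1 else 1.0 - p             # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with non-negative normalized weights.

    Each output element is sum_i(w_i * f_i) / (eps + sum_j(w_j)), where
    negative weights are first clipped to zero (a ReLU on the weights).
    """
    w = [max(x, 0.0) for x in weights]
    total = sum(w) + eps
    length = len(features[0])
    return [
        sum(wi * f[k] for wi, f in zip(w, features)) / total
        for k in range(length)
    ]


if __name__ == "__main__":
    # A confident correct prediction contributes far less loss than a wrong one.
    print(binary_focal_loss(0.9, 1), binary_focal_loss(0.1, 1))
    # Three inputs, as a stand-in for a three-channel fusion node.
    print(fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
                                 [0.5, 1.0, 1.5]))
```

In a real detector the weights would be learnable parameters and the features would be tensors resized to a common resolution; plain lists are used here only to keep the sketch dependency-free.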

WANG Nengwen, ZHANG Tao. Improved YOLOX-S Real-Time Multi-Scale Traffic Sign Detection Algorithm[J]. Computer Engineering and Applications, 2023, 59(21): 167-175.
