Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (18): 316-323. DOI: 10.3778/j.issn.1002-8331.2306-0223

• Engineering and Applications •

Traffic Sign Recognition Based on Omni-Dimensional Dynamic Convolution

LI Wenju, YU Jie, SHA Liye, CUI Liu, YANG Hongzhe

  1. School of Computer Science and Information Engineering, Shanghai Institute of Technology, Shanghai 201418, China
  2. Shanghai Precision Weighing and Dosing System Co., Ltd., Shanghai 201108, China
  • Online: 2024-09-15  Published: 2024-09-13

Abstract: Existing traffic sign recognition algorithms have low accuracy on small and occluded targets and are slow. To address these problems, a traffic sign recognition algorithm based on omni-dimensional dynamic convolution (ODConv) is designed by improving the YOLOv5 network. First, some of the convolutions in the backbone network are replaced with ODConv, so that richer information is captured during feature extraction and the network becomes more sensitive to small targets. Second, to reduce information loss during upsampling, a sub-pixel convolution module replaces the original nearest-neighbor interpolation upsampling module in the feature fusion network, and an efficient layer aggregation module replaces the original CSPNet module, which improves feature fusion efficiency, lengthens the shortest gradient path, and improves small-target detection. Finally, the SIoU function is used to compute the regression loss, which addresses the direction mismatch between the ground-truth box and the predicted box and further improves detection accuracy for road traffic signs. On the TT100K dataset, the model reaches a mean average precision (mAP@0.5) of 93.85% and a recall of 90.73%, improvements of 3.90% and 5.69%, respectively, over the baseline network YOLOv5n, and the frame processing speed reaches 89.29.
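
The abstract does not describe the sub-pixel convolution module in detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the standard pixel-shuffle rearrangement used in sub-pixel convolution (the channel ordering matches PyTorch's `nn.PixelShuffle`), applied to feature maps stored as plain nested lists. The function names `pixel_shuffle` and `nearest_upsample` are hypothetical. In a real module, a learned convolution would first produce the `C*r*r` channels; here they are given directly.

```python
def pixel_shuffle(x, r):
    """Rearrange a [C*r*r][H][W] nested list into [C][H*r][W*r].

    Each group of r*r input channels supplies the r x r block of output
    pixels at every spatial location, so no input value is discarded.
    """
    channels_in, height, width = len(x), len(x[0]), len(x[0][0])
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    channels_out = channels_in // (r * r)
    out = [[[0] * (width * r) for _ in range(height * r)]
           for _ in range(channels_out)]
    for c in range(channels_out):
        for i in range(r):          # row offset inside the r x r block
            for j in range(r):      # column offset inside the r x r block
                src = x[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out


def nearest_upsample(x, r):
    """Nearest-neighbour upsampling of a [C][H][W] nested list by factor r.

    Every input pixel is simply copied into an r x r block.
    """
    return [[[row[w // r] for w in range(len(row) * r)]
             for row in ch for _ in range(r)]
            for ch in x]


if __name__ == "__main__":
    # Four 1x1 channels become one 2x2 map: each output pixel is distinct.
    print(pixel_shuffle([[[1]], [[2]], [[3]], [[4]]], 2))
    # One 1x2 channel becomes a 2x4 map of repeated values.
    print(nearest_upsample([[[1, 2]]], 2))
```

The contrast is that nearest-neighbor interpolation only repeats existing values, while the pixel-shuffle step fills each upsampled block from separate (in practice, learned) channels, which is one plausible reading of why the abstract expects less information loss.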

Key words: traffic sign recognition, YOLOv5, omni-dimensional dynamic convolution, sub-pixel convolution module, efficient layer aggregation module
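
The mAP@0.5 and recall figures in the abstract both rest on an IoU threshold of 0.5: a predicted box counts as a correct detection only if its intersection over union with a ground-truth box is at least 0.5. The sketch below illustrates that criterion and a simple recall computation. It is not the paper's evaluation code. It assumes a single class, a single image, boxes given as `(x1, y1, x2, y2)`, and greedy matching in descending score order; `iou` and `recall_at_iou` are hypothetical helper names.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def recall_at_iou(predictions, ground_truths, threshold=0.5):
    """Fraction of ground-truth boxes matched by some prediction.

    predictions: list of (score, box) pairs. Predictions are visited in
    descending score order, and each ground-truth box can be matched at
    most once (an assumed, simplified matching rule).
    """
    matched = set()
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue
            value = iou(box, gt)
            if value > best_iou:
                best_iou, best_idx = value, idx
        if best_idx is not None and best_iou >= threshold:
            matched.add(best_idx)
    return len(matched) / len(ground_truths) if ground_truths else 0.0


if __name__ == "__main__":
    gts = [(0, 0, 2, 2), (10, 10, 12, 12)]
    preds = [(0.9, (0, 0, 2, 2)), (0.8, (1, 0, 3, 2))]
    print(recall_at_iou(preds, gts))
```

Small traffic signs make this criterion strict: a shift of a few pixels can push the IoU of a small box below 0.5, which is consistent with the abstract's emphasis on small-target detection and on the box regression loss.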