Vehicle Detection Algorithm Based on Dual Branch Feature Aggregation Network

doi:10.3778/j.issn.1002-8331.2405-0401

Abstract

Abstract: Vehicle detection is a key component of autonomous driving. Existing vehicle detection algorithms do not fully account for the complementary strengths and weaknesses of convolutional neural networks (CNNs) and Transformers in feature extraction, which limits overall network performance to some extent. This paper proposes a dual branch feature aggregation network that combines a CNN and a Transformer. In the encoding stage, a dual branch backbone network is built on the respective strengths of the CNN and the Transformer to extract features from the original image. A multi-level spatial attention module and a dual branch feature aggregation module let the features of the two branches guide each other's learning. A dual branch attention module then further reduces the loss of feature information in the deeper layers of the network. Ablation and comparative experiments verify the effectiveness of the proposed algorithm: compared with mainstream object detection algorithms, it improves mean average precision (mAP) by about 3.5%.

Key words: vehicle target detection, convolutional neural network (CNN), Transformer, dual branch, guided learning
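
The abstract reports its gain in mAP without spelling out how the metric is computed. As a minimal sketch, the snippet below uses one common definition (all-point interpolated average precision per class, averaged over classes); the evaluation protocol actually used in the paper is not stated here and may differ. The function names and the toy detections are hypothetical, and matching detections to ground-truth boxes (for example by an IoU threshold) is assumed to have been done already.

```python
def average_precision(scored_hits, num_gt):
    """All-point interpolated AP for a single class.

    scored_hits: list of (confidence, is_true_positive), one per detection.
    num_gt: number of ground-truth boxes of this class.
    """
    if num_gt == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)

    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_gt)

    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class):
    """per_class maps class name -> (scored_hits, num_gt)."""
    aps = [average_precision(hits, n) for hits, n in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


# Toy data (hypothetical): two classes with pre-matched detections.
toy = {
    "car": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "bus": ([(0.6, True)], 2),
}
print(round(mean_average_precision(toy), 4))
```

With this definition, a reported mAP improvement is a change in the class-averaged area under the precision-recall curves, so it reflects both how many vehicles are found and how well the detections are ranked by confidence.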

LYU Meng, MAO Shenghui, CHAI Liang, GAO Pengfei, SHI Lei. Vehicle Detection Algorithm Based on Dual Branch Feature Aggregation Network[J]. Computer Engineering and Applications, 2024, 60(22): 240-250.
