Dense Road Vehicle Detection Algorithm Based on Lightweight ConvLSTM

doi:10.3778/j.issn.1002-8331.2112-0408

Abstract

Abstract: Target occlusion in congested scenes causes missed detections and false alarms. Because the same vehicle overlaps with other vehicles to different degrees at different moments in a video, features from moments when the vehicle is unoccluded can help detect it at the current moment. Building on this observation, a vehicle detection algorithm for dense scenes, WB-YOLO v5, is proposed. First, feature selection and feature sparsity modules are designed around the input data structure of ConvLSTM to recalibrate features. Second, the outputs of the two modules are fed into different branches of ConvLSTM, so that features from different moments are strengthened or attenuated. Third, 1×1 convolutions replace the original gating structure to build a lightweight WBConvLSTM, which reduces the parameter count and computational cost and improves both training speed and detection accuracy on targets from small-sample data sources. Finally, WBConvLSTM is introduced into the Neck of YOLO v5 to strengthen the network's feature extraction. Experimental results show that WB-YOLO v5 improves mean average precision (mAP) by 1.83 percentage points over YOLO v5, and that WBConvLSTM reduces the parameter count and computational cost by about 2/3 and 6/13, respectively, relative to ConvLSTM.

Key words: dense vehicles, lightweight ConvLSTM, feature sparsity, feature selection, spatiotemporal information
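
To make the lightweighting idea concrete, the following is a minimal sketch of how gate kernel size drives the parameter count of a ConvLSTM cell. It is not the authors' WBConvLSTM: it assumes a plain cell with four gates, each convolving the concatenation of the input and the previous hidden state, and it ignores peephole terms. The function name `conv_gate_params` and the channel sizes are hypothetical and chosen only for illustration.

```python
def conv_gate_params(in_channels, hidden_channels, kernel_size, num_gates=4, bias=True):
    """Count gate-convolution weights for one plain ConvLSTM cell.

    Assumption: each gate convolves [x_t, h_{t-1}], i.e. in_channels +
    hidden_channels input channels, into hidden_channels output channels.
    """
    fan_in = (in_channels + hidden_channels) * kernel_size * kernel_size
    per_gate = hidden_channels * fan_in + (hidden_channels if bias else 0)
    return num_gates * per_gate


# Hypothetical channel sizes, not taken from the paper.
C_IN, C_HID = 64, 64

full_3x3 = conv_gate_params(C_IN, C_HID, kernel_size=3)  # conventional 3x3 gates
all_1x1 = conv_gate_params(C_IN, C_HID, kernel_size=1)   # every gate replaced by 1x1

print("3x3 gates:", full_3x3)
print("1x1 gates:", all_1x1)
print("fraction kept:", all_1x1 / full_3x3)
```

Under these assumptions, replacing every 3×3 gate with a 1×1 gate removes close to 8/9 of the gate weights. The abstract reports a reduction of about 2/3, so the actual WBConvLSTM evidently differs from this naive all-gates replacement, for example through the added feature selection and feature sparsity modules. The abstract does not give enough detail to reproduce that figure.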

JIN Zhi, ZHANG Qian, LI Xiying. Dense Road Vehicle Detection Based on Lightweight ConvLSTM[J]. Computer Engineering and Applications, 2023, 59(8): 89-96.
