Computer Engineering and Applications, 2023, Vol. 59, Issue (8): 89-96. DOI: 10.3778/j.issn.1002-8331.2112-0408

• Pattern Recognition and Artificial Intelligence •

Dense Road Vehicle Detection Based on Lightweight ConvLSTM

JIN Zhi, ZHANG Qian, LI Xiying   

  1. School of Intelligent Systems Engineering, Sun Yat-sen University, Guangzhou 510006, China
  2. Guangdong Province Key Laboratory of Intelligent Transportation System, Guangzhou 510006, China
  3. Key Laboratory of Video and Image Intelligent Analysis and Application Technology, Ministry of Public Security, Guangzhou 510006, China
  • Online: 2023-04-15 Published: 2023-04-15




Abstract: Target occlusion in congested scenes causes false negatives and false positives in vehicle detection. Because the degree of overlap of the same vehicle in a video differs from moment to moment, features from moments when the vehicle is unoccluded can help detect it at the current moment. On this basis, WB-YOLO v5, a vehicle detection algorithm suited to dense scenes, is proposed. First, feature selection and feature sparsity modules are designed around the input data structure of ConvLSTM to recalibrate features. The outputs of the two modules are fed into different branches of ConvLSTM, so that features from different moments are enhanced or attenuated. Next, 1×1 convolution replaces the original gating structure to build a lightweight WBConvLSTM, which reduces the number of parameters and the amount of computation and improves both training speed and detection accuracy for targets from small-sample data sources. Finally, WBConvLSTM is introduced into the Neck of YOLO v5 to strengthen the network's feature extraction ability. Experimental results show that WB-YOLO v5 improves mAP by 1.83 percentage points over YOLO v5, and that WBConvLSTM reduces the number of parameters and the amount of computation by about 2/3 and 6/13, respectively, compared with ConvLSTM.

Key words: dense vehicles, lightweight ConvLSTM, feature sparsity, feature selection, spatiotemporal information
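
To give a rough sense of why replacing gate convolutions with 1×1 convolutions makes a ConvLSTM lighter, the sketch below counts the parameters of the gate convolutions in a standard ConvLSTM cell for two kernel sizes. It is only an illustration under stated assumptions: the cell is taken to have four gates, each a single convolution over the concatenation of the input and the previous hidden state, with no peephole weights, and the channel counts (64 and 64) and the 3×3 baseline kernel are hypothetical. It does not reproduce the WBConvLSTM design, whose details (including the feature selection and feature sparsity modules) the abstract does not give, so the ratio it prints should not be expected to match the reported reduction of about 2/3.

```python
def conv_params(in_ch: int, out_ch: int, kernel: int, bias: bool = True) -> int:
    """Parameter count of one 2D convolution with a square kernel."""
    weights = out_ch * in_ch * kernel * kernel
    return weights + (out_ch if bias else 0)


def convlstm_gate_params(in_ch: int, hidden_ch: int, kernel: int) -> int:
    """Parameters of the four gate convolutions in a standard ConvLSTM cell.

    Assumption: each gate (input, forget, output, candidate) convolves the
    channel-wise concatenation [x_t, h_{t-1}] down to hidden_ch channels.
    Peephole weights are ignored.
    """
    return 4 * conv_params(in_ch + hidden_ch, hidden_ch, kernel)


if __name__ == "__main__":
    # Hypothetical channel counts, chosen only for illustration.
    in_ch, hidden_ch = 64, 64
    baseline = convlstm_gate_params(in_ch, hidden_ch, kernel=3)
    pointwise = convlstm_gate_params(in_ch, hidden_ch, kernel=1)
    print(f"3x3 gates: {baseline} parameters")   # 295168
    print(f"1x1 gates: {pointwise} parameters")  # 33024
    print(f"ratio: {pointwise / baseline:.3f}")
```

In this simplified setting the weight count scales with the square of the kernel size, so the saving comes almost entirely from shrinking the kernel. The figures reported in the abstract reflect the full WBConvLSTM structure rather than this naive substitution.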
