Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (5): 271-279. DOI: 10.3778/j.issn.1002-8331.2108-0346

• Engineering and Applications •

改进YOLOv5s+DeepSORT的监控视频车流量统计

李永上,马荣贵,张美月   

  1. 长安大学 信息工程学院,西安 710064
  • 出版日期:2022-03-01 发布日期:2022-03-01

Traffic Monitoring Video Vehicle Volume Statistics Method Based on Improved YOLOv5s+DeepSORT

LI Yongshang, MA Ronggui, ZHANG Meiyue   

  1. School of Information Engineering, Chang’an University, Xi’an 710064, China
  • Online: 2022-03-01 Published: 2022-03-01

摘要: 针对监控视频中车流量统计准确率低的问题,提出一种改进YOLOv5s检测结合Deep SORT跟踪的车流量统计方法。为了提升检测器识别效果,将注意力模块CBAM与YOLOv5s网络的Neck部分融合,提高网络的特征提取能力;将CIoU Loss代替GIoU Loss作为目标边界框回归损失函数,加快边界框回归速率的同时提高定位精度;使用DIoU-NMS替换NMS,改善目标拥挤时的漏检问题。调整Deep SORT外观特征提取网络的结构,并在车辆重识别数据集上重新训练,降低目标遮挡导致的身份切换。连接改进的YOLOv5s检测器和Deep SORT,在视频中设置虚拟检测线统计车流量。实验结果表明:改进的YOLOv5s相较原始算法平均准确率提高2.3个百分点,结合Deep SORT跟踪,在行车平峰、高峰、夜间三种场景的车流量统计准确率达到93.5%、91.2%、89.9%。

关键词: YOLOv5s, Deep SORT, 注意力机制, CIoU, 车流量统计

Abstract: To address the low accuracy of vehicle volume statistics from traffic monitoring video, a method combining an improved YOLOv5s detector with Deep SORT tracking is proposed. To improve detection performance, the attention module CBAM is integrated into the Neck of the YOLOv5s network to strengthen its feature extraction ability. CIoU Loss replaces GIoU Loss as the bounding box regression loss function, which speeds up bounding box regression and improves localization accuracy. DIoU-NMS replaces NMS to reduce missed detections when targets are crowded. The structure of the appearance feature extraction network of Deep SORT is adjusted, and the network is retrained on a vehicle re-identification dataset to reduce identity switches caused by target occlusion. The improved YOLOv5s detector is connected to Deep SORT, and a virtual detection line is set in the video to count the traffic flow. Experimental results show that the improved YOLOv5s achieves an average accuracy 2.3 percentage points higher than the original algorithm. Combined with Deep SORT tracking, the accuracy of traffic flow statistics in off-peak, rush-hour, and night scenarios reaches 93.5%, 91.2%, and 89.9%, respectively.
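
The abstract does not give implementation details for DIoU-NMS, so the following is only a minimal pure-Python sketch of the general technique, not the authors' code. It assumes boxes in (x1, y1, x2, y2) format; the function names `iou_and_diou` and `diou_nms` and the threshold value are hypothetical. The idea is that a box is suppressed only when its IoU with the kept box, minus a penalty for the distance between the two box centers, exceeds the threshold, so heavily overlapping boxes with clearly separated centers (typical of crowded vehicles) can both survive.

```python
def iou_and_diou(a, b):
    """Return (IoU, DIoU) for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Intersection area (zero if the boxes do not overlap).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centers.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
       + (max(ay2, by2) - min(ay1, by1)) ** 2
    penalty = rho2 / c2 if c2 > 0 else 0.0
    return iou, iou - penalty


def diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS that uses DIoU instead of IoU as the suppression criterion.

    Returns the indices of the kept boxes, highest score first.
    The default threshold is an illustrative assumption.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Keep only candidates whose DIoU with the selected box is below the threshold.
        order = [i for i in order
                 if iou_and_diou(boxes[best], boxes[i])[1] < threshold]
    return keep
```

With a threshold of 0.4, two equally sized boxes shifted horizontally by 40% of their width have an IoU above the threshold but a DIoU below it, so standard NMS would drop the second box while this variant keeps it.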

Key words: YOLOv5s, Deep SORT, attention mechanism, CIoU, vehicle volume statistics
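
The virtual detection line counting mentioned in the abstract can be illustrated with a short sketch. This is an assumption about how such counting is commonly done, not the procedure described in the paper: each tracked vehicle is reduced to the center point of its bounding box, and a vehicle is counted once when its center moves from one side of the line to the other. The class name `LineCounter`, the helper `side`, and the input format (track ID plus center point per frame) are hypothetical, and the line is treated as infinite for simplicity.

```python
def side(p, a, b):
    """Return -1, 0, or 1 depending on which side of the line a->b point p lies."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return (cross > 0) - (cross < 0)


class LineCounter:
    """Counts each track ID once when its center crosses the virtual line."""

    def __init__(self, a, b):
        self.a, self.b = a, b      # two points defining the virtual line
        self.last_side = {}        # track_id -> last non-zero side seen
        self.counted = set()       # track IDs that have already been counted

    def update(self, tracks):
        """Process one frame; tracks is an iterable of (track_id, (cx, cy)).

        Returns the running vehicle count.
        """
        for track_id, center in tracks:
            s = side(center, self.a, self.b)
            if s == 0:
                continue  # exactly on the line: wait for the next frame
            prev = self.last_side.get(track_id)
            if prev is not None and prev != s:
                # A sign change means a crossing; the set makes a repeated
                # crossing by the same ID count only once.
                self.counted.add(track_id)
            self.last_side[track_id] = s
        return len(self.counted)
```

Because counting is keyed on track IDs, an identity switch during occlusion can cause a vehicle to be missed or counted twice, which is consistent with the abstract's emphasis on reducing identity switches in Deep SORT.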