Computer Engineering and Applications ›› 2019, Vol. 55 ›› Issue (2): 12-20. DOI: 10.3778/j.issn.1002-8331.1810-0333

Fast Vehicle Detection Method Based on Improved YOLOv3

ZHANG Fukai, YANG Feng, LI Ce   

  1. School of Mechanical Electronic and Information Engineering, China University of Mining and Technology (Beijing), Beijing 100083, China
  • Online: 2019-01-15 Published: 2019-01-15

基于改进YOLOv3的快速车辆检测方法

张富凯,杨  峰,李  策   

  1. 中国矿业大学(北京) 机电与信息工程学院,北京 100083

Abstract: Vehicle detection in image or video data is an important but challenging task for urban traffic surveillance. The difficulty lies in accurately locating and classifying relatively small vehicles in complex scenes. To address these problems, this paper presents a single-stage deep neural network (DF-YOLOv3) for fast detection of vehicles of different types in urban traffic surveillance. DF-YOLOv3 improves on the conventional YOLOv3 in two steps: it first enhances the residual network used to extract vehicle features, then designs convolutional feature maps at six different scales and merges them with the feature maps of corresponding scale from the preceding residual network, forming the final feature pyramid used for vehicle prediction. Experimental results on the KITTI dataset show that the proposed DF-YOLOv3 achieves strong detection performance in both accuracy and speed. Specifically, for the 512×512 input model on an NVIDIA GTX 1080Ti GPU, DF-YOLOv3 achieves 93.61% mAP (mean average precision) at a speed of 45.48 f/s (frames per second). In terms of accuracy, DF-YOLOv3 outperforms Fast R-CNN, Faster R-CNN, DAVE, YOLO, SSD, YOLOv2, YOLOv3 and SINet.

Key words: vehicle detection, feature fusion, convolutional neural network, real-time detection, YOLOv3
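
The abstract describes merging newly designed feature maps with residual-network feature maps of the same scale, but does not state the merge operator. The sketch below assumes the scheme used by the original YOLOv3, which upsamples the coarser map by a factor of 2 and concatenates it with the finer map along the channel axis. The function names (`upsample2x`, `fuse`, `shape`) and the toy data are illustrative only and are not taken from the paper.

```python
# Feature maps are plain nested lists shaped [channels][height][width].
# Assumption: fusion = 2x nearest-neighbour upsampling + channel concatenation,
# as in the original YOLOv3; DF-YOLOv3's exact operator may differ.

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a [C][H][W] feature map."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # duplicate each column
            rows.append(wide)
            rows.append(list(wide))                    # duplicate each row
        out.append(rows)
    return out


def fuse(coarse, fine):
    """Upsample the coarse map and concatenate it with the fine map along channels."""
    up = upsample2x(coarse)
    if len(up[0]) != len(fine[0]) or len(up[0][0]) != len(fine[0][0]):
        raise ValueError("spatial sizes must match after upsampling")
    return up + fine  # list concatenation acts on the channel axis


def shape(fmap):
    """Return (channels, height, width) of a nested-list feature map."""
    return (len(fmap), len(fmap[0]), len(fmap[0][0]))


coarse = [[[1, 2], [3, 4]]]                              # 1 channel, 2x2 (deep layer)
fine = [[[0] * 4 for _ in range(4)] for _ in range(2)]   # 2 channels, 4x4 (shallow layer)
fused = fuse(coarse, fine)
print(shape(fused))
```

Concatenation keeps the fine-grained spatial detail of the shallow map alongside the semantic content of the deep map, which is the usual motivation for this kind of fusion when small objects must be detected.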

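Abstract (translated from the Chinese original): Detecting vehicles in image or video data is an important and challenging task in urban traffic surveillance. The difficulty lies in accurately locating and classifying relatively small vehicles in complex scenes. To address these problems, a single-stage deep neural network (DF-YOLOv3) is proposed for real-time detection of different vehicle types in urban traffic surveillance. DF-YOLOv3 improves on the conventional YOLOv3 algorithm: it first strengthens the deep residual network used to extract vehicle features, then designs convolutional feature maps at six different scales and fuses them with the feature maps of corresponding scale in the residual network, forming the final feature pyramid that performs vehicle prediction. Experiments on the KITTI dataset show that the proposed DF-YOLOv3 achieves high detection performance in both accuracy and speed. Specifically, for a 512×512 input on an NVIDIA 1080Ti GPU, DF-YOLOv3 reaches 93.61% mAP (mean average precision) at 45.48 f/s (frames per second). In terms of accuracy in particular, DF-YOLOv3 performs better than Fast R-CNN, Faster R-CNN, DAVE, YOLO, SSD, YOLOv2, YOLOv3 and SINet.

Key words (translated from the Chinese original): vehicle detection, feature fusion, convolutional neural network, real-time detection, YOLOv3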
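
The accuracy figure in the abstract is reported as mAP (mean average precision). As a rough illustration of what that metric measures, the sketch below computes per-class average precision as the area under the precision-recall curve with all-point interpolation, then averages over classes. The abstract does not state the evaluation protocol (IoU threshold, interpolation scheme), so those choices are assumptions here, and the class names and detections are made-up toy data.

```python
def average_precision(detections, num_ground_truth):
    """Area under the precision-recall curve for one class.

    detections: list of (confidence, is_true_positive) pairs.
    num_ground_truth: number of annotated objects of that class.
    Assumes true/false-positive matching (e.g. by IoU) was done beforehand.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)
    # Interpolate: precision at each point becomes the best precision
    # achieved at that recall or any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas between successive recall values.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class):
    """Mean of per-class AP; per_class maps name -> (detections, num_ground_truth)."""
    aps = [average_precision(dets, n) for dets, n in per_class.values()]
    return sum(aps) / len(aps)


# Toy data: hypothetical classes and detections, not results from the paper.
per_class = {
    "car": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "van": ([(0.95, True)], 1),
}
print(round(mean_average_precision(per_class), 4))
```

In this toy case the "car" class scores an AP of about 0.83 because one false positive is ranked above the second true positive, while "van" scores 1.0, giving a mean of about 0.92.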