Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (23): 132-141. DOI: 10.3778/j.issn.1002-8331.2105-0337

• Pattern Recognition and Artificial Intelligence •

XSSD-P: Improved SSD Pedestrian Detection Algorithm

BAO Wenbin (鲍文斌), ZHANG Dongquan (张冬泉)

  1. School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing 100044, China
  • Online: 2022-12-01 Published: 2022-12-01

Abstract: SSD (single shot multi-box detector) is a neural network algorithm widely used for pedestrian detection. To improve its detection accuracy and speed, this paper proposes an improved SSD algorithm, called XSSD-P. First, the Xception network is adopted as the backbone of XSSD-P, and the feature layers used for prediction are re-selected. Second, multi-scale convolution kernels and basic anchor boxes are designed according to the shape and size characteristics of pedestrians, and the two are coupled; each basic anchor box is resized to produce the anchors used for position regression. Finally, depthwise separable convolution replaces conventional convolution for prediction on the feature maps. Detection accuracy is compared on the INRIA, VOC, and COCO datasets, where XSSD-P achieves higher pedestrian detection accuracy than SSD and other mainstream algorithms. Its generalization ability is verified on the Caltech and MIT pedestrian datasets. In terms of speed, XSSD-P runs 30 FPS faster than SSD, an increase of 42.86%. The experimental results show that XSSD-P outperforms SSD in both detection accuracy and detection speed.

Key words: pedestrian detection, single shot multi-box detector (SSD) algorithm, convolutional neural network, multi-scale convolution kernel, Xception network
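
The abstract states that depthwise separable convolution replaces conventional convolution in the prediction layers. As a rough illustration of why this substitution can reduce computation, the sketch below compares weight counts for the two layer types. The kernel size and channel counts are hypothetical values chosen only for the example; the abstract does not give the actual XSSD-P layer dimensions, and the function names are not from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weight count of a k x k standard convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weight count of a k x k depthwise conv plus a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, for illustration only (not taken from the paper).
k, c_in, c_out = 3, 256, 24
std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
print(std, sep, round(sep / std, 3))
```

For these assumed sizes the ratio of separable to standard weights equals 1/c_out + 1/k², which is the general form of the saving for this factorization.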