Computer Engineering and Applications (计算机工程与应用) ›› 2021, Vol. 57 ›› Issue (4): 176-182. DOI: 10.3778/j.issn.1002-8331.1912-0174

• Graphics and Image Processing •

多尺度特征融合重建的行人检测方法

李佐龙,王帮海,卢增   

  1. 广东工业大学 计算机学院,广州 510006
  • 出版日期:2021-02-15 发布日期:2021-02-06

Pedestrian Detection Method Based on Multi-scale Feature Fusion and Reconstruction

LI Zuolong, WANG Banghai, LU Zeng   

  1. School of Computer, Guangdong University of Technology, Guangzhou 510006, China
  • Online: 2021-02-15 Published: 2021-02-06

摘要:

行人在众多场景中都存在多尺度变化问题,严重影响检测器的精度,为此设计卷积特征重建和通道注意力两种模块来增强对多尺度行人的检测效果。以原始输入的多尺度特征为基础融合重建多个特征金字塔,然后融合多个特征金字塔中的相同尺度特征,并学习每层特征的通道注意力权值来增加有效通道层权重,由此得到的特征才用于最后的检测。将这两种模块集成到RFBnet模型中,并改进模型损失函数用以优化对遮挡行人的检测效果。在Caltech-USA、INRIA和ETH三个数据集上的测试结果表明,新方法的准确率高于RFBnet和MS-CNN等一些多尺度方法,在不同尺度行人的测试子集上达到了最优的检测效果。

关键词: 行人检测, 卷积神经网络, 多尺度特征, 遮挡处理

Abstract:

Pedestrians vary widely in scale across many scenes, which seriously degrades detector accuracy. To address this, two modules are designed to improve the detection of multi-scale pedestrians: convolutional feature reconstruction and channel attention. Multiple feature pyramids are first fused and reconstructed from the multi-scale features of the original input. Features of the same scale are then fused across these pyramids, and a channel attention weight is learned for each feature layer to increase the weight of the effective channels; only the resulting features are used for the final detection. The two modules are integrated into the RFBnet model, and the model's loss function is modified to improve the detection of occluded pedestrians. Test results on the Caltech-USA, INRIA and ETH datasets show that the new method is more accurate than multi-scale methods such as RFBnet and MS-CNN, and that it achieves the best detection performance on the test subsets of pedestrians at different scales.

Key words: pedestrian detection, convolutional neural network, multi-scale feature, occlusion handling
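
The fusion and channel attention steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how same-scale features are fused or how attention weights are computed, so element-wise averaging and a per-channel sigmoid gate on the globally pooled response are assumed here purely for illustration. The names `fuse_same_scale`, `channel_attention`, `weights` and `biases` are hypothetical, and a real module would typically learn its parameters and mix information across channels.

```python
import math

def fuse_same_scale(feature_maps):
    """Fuse same-shape feature maps (channels x height x width nested lists).

    Assumption: fusion is an element-wise mean; the paper may use another operator.
    """
    n = len(feature_maps)
    channels = len(feature_maps[0])
    height = len(feature_maps[0][0])
    width = len(feature_maps[0][0][0])
    return [[[sum(fm[c][y][x] for fm in feature_maps) / n
              for x in range(width)]
             for y in range(height)]
            for c in range(channels)]

def channel_attention(feature, weights, biases):
    """Rescale each channel by a sigmoid gate computed from its global average.

    Assumption: one scalar weight and bias per channel stand in for learned parameters.
    Returns the reweighted feature and the list of per-channel gates.
    """
    reweighted, gates = [], []
    for c, channel in enumerate(feature):
        # Global average pooling over the spatial positions of this channel.
        pooled = sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        # Sigmoid gate in (0, 1); larger pooled responses give larger gates here.
        gate = 1.0 / (1.0 + math.exp(-(weights[c] * pooled + biases[c])))
        gates.append(gate)
        reweighted.append([[value * gate for value in row] for row in channel])
    return reweighted, gates

# Two toy "pyramids", each contributing a 2-channel, 2x2 feature map at one scale.
pyramid_a = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
pyramid_b = [[[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]]]

fused = fuse_same_scale([pyramid_a, pyramid_b])
reweighted, gates = channel_attention(fused, weights=[1.0, 1.0], biases=[0.0, 0.0])
```

In this toy case the first channel carries a stronger response than the second, so it receives the larger gate and is attenuated less. That is the sense in which channel attention increases the relative weight of effective channels before detection.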