Computer Engineering and Applications, 2022, Vol. 58, Issue 4: 229-236. DOI: 10.3778/j.issn.1002-8331.2009-0298

• Graphics and Image Processing •

Improved Cascade RCNN Pedestrian Detection Algorithm Research

LIU Yanping (刘艳萍), LIU Tian (刘甜)

  1. School of Electronic Information Engineering, Hebei University of Technology, Tianjin 300401, China
  • Online: 2022-02-15 Published: 2022-02-15

Abstract: In pedestrian detection under complex road conditions, pedestrian size varies widely, so small pedestrians are frequently missed, which makes detection harder. To reduce the miss rate and improve detection accuracy, this paper modifies the cascade regional convolutional neural network (Cascade RCNN) in two ways. First, low-level (shallow) feature maps are fused with high-level (deep) feature maps so that the high-level features enhance the low-level ones, which improves the utilization of high-level information; a bottom-up pathway is also added to pass low-level information directly upward, which improves the utilization of low-level spatial information. Second, the fully connected layers used for pedestrian classification and bounding-box regression are replaced with decoupled regression and classification branches, so that the whole bounding box is classified and regressed more robustly. Experiments on the Caltech and ETH pedestrian datasets show that, compared with the original Cascade RCNN, the improved Cascade RCNN reduces the miss rate for large, medium and small pedestrians on the Caltech dataset by 7.9, 11.4 and 9.1 percentage points, respectively, and increases the mean average precision by 3.0 percentage points; on the ETH dataset, the miss rate falls by 5.6 percentage points and the mean average precision rises by 2.3 percentage points.
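The results are reported in percentage points, that is, as absolute differences between two rates rather than relative changes. The sketch below illustrates the distinction. The counts are hypothetical and are not taken from the paper, the helper names are illustrative, and the plain "missed / ground truth" definition is an assumption, since the abstract does not state the exact evaluation protocol.

```python
def miss_rate(missed: int, total_ground_truth: int) -> float:
    """Fraction of ground-truth pedestrians that were not detected."""
    if total_ground_truth <= 0:
        raise ValueError("total_ground_truth must be positive")
    return missed / total_ground_truth


def drop_in_percentage_points(before: float, after: float) -> float:
    """Absolute difference between two rates, expressed in percentage points."""
    return (before - after) * 100.0


def relative_drop_percent(before: float, after: float) -> float:
    """Relative change with respect to the starting rate, in percent."""
    return (before - after) / before * 100.0


# Hypothetical counts chosen only for illustration (not from the paper).
baseline = miss_rate(missed=300, total_ground_truth=1000)
improved = miss_rate(missed=221, total_ground_truth=1000)

points = drop_in_percentage_points(baseline, improved)
relative = relative_drop_percent(baseline, improved)
print(f"drop: {points:.1f} percentage points, {relative:.1f}% relative")
```

With these made-up counts, the same improvement reads as a drop of 7.9 percentage points but a relative reduction of roughly 26%, which is why the unit matters when comparing detectors.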

Key words: pedestrian detection, feature enhancement, feature pyramid, fully connected layer, decoupled regression and classification branch
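The two feature paths named in the abstract can be illustrated with a toy sketch of the information flow. This is not the authors' network. It uses 1-D lists in place of feature maps, and it uses nearest-neighbour upsampling, stride-2 subsampling and element-wise addition in place of learned convolutions. These are assumptions made only to show which level feeds which. The function names `top_down` and `bottom_up` are hypothetical.

```python
from typing import List

Feature = List[float]


def upsample2x(x: Feature) -> Feature:
    """Nearest-neighbour upsampling: repeat every value twice."""
    return [v for v in x for _ in range(2)]


def downsample2x(x: Feature) -> Feature:
    """Stride-2 subsampling: keep every second value."""
    return x[::2]


def add(a: Feature, b: Feature) -> Feature:
    """Element-wise sum, standing in for a learned fusion layer."""
    if len(a) != len(b):
        raise ValueError("feature lengths must match")
    return [p + q for p, q in zip(a, b)]


def top_down(levels: List[Feature]) -> List[Feature]:
    """Enhance each shallower level with the upsampled deeper result.

    levels[0] is the shallowest (highest-resolution) map.
    """
    out = [levels[-1]]
    for feat in reversed(levels[:-1]):
        out.insert(0, add(feat, upsample2x(out[0])))
    return out


def bottom_up(levels: List[Feature]) -> List[Feature]:
    """Pass shallow information upward by adding the subsampled previous level."""
    out = [levels[0]]
    for feat in levels[1:]:
        out.append(add(feat, downsample2x(out[-1])))
    return out


# Three toy levels, each half the resolution of the previous one.
backbone = [[1.0] * 8, [2.0] * 4, [3.0] * 2]
enhanced = top_down(backbone)   # deep features enhance shallow ones
fused = bottom_up(enhanced)     # shallow information is passed back up
print([len(f) for f in fused], fused[-1])
```

Each output level keeps its input resolution. After both passes the deepest level has received contributions from every shallower level, which is the effect the added shallow-to-deep path is meant to achieve.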