Occluded Pedestrian Detection Based on Multi-Scale Context Information

doi:10.3778/j.issn.1002-8331.2011-0115

Abstract

Abstract: Pedestrian detection in occluded scenes remains a difficult problem in computer vision: occluded pedestrians vary widely in scale and have low visibility, which makes detection challenging. To address this, this paper proposes a model structure for occluded pedestrian detection that improves an anchor-free pedestrian detection method. First, a structure for extracting multi-scale context information is designed: multiple convolutional layers with different dilation rates are cascaded, dense connections share features across scales, and the context information of each region is extracted to handle occlusion. In addition, to improve the discriminability of the features, a channel attention mechanism adaptively adjusts the multi-scale feature fusion. Experimental results show that the method achieves an MR⁻² of 41.73% on the occlusion subset of the Caltech pedestrian dataset, outperforming the other detectors it is compared with.
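The multi-scale effect of cascading dilated convolutions with dense connections can be illustrated with receptive-field arithmetic. The sketch below is not the authors' implementation: the dilation rates `(1, 2, 4)`, the 3x3 kernel size, and all function names are assumptions chosen for illustration, since the abstract does not state them. It assumes stride-1 convolutions and treats every subset of layers, taken in order, as one path through the densely connected cascade.

```python
from itertools import combinations


def dilated_kernel_extent(kernel_size: int, dilation: int) -> int:
    """Span, in input pixels, covered by a single dilated convolution kernel."""
    return dilation * (kernel_size - 1) + 1


def stacked_receptive_field(dilations, kernel_size: int = 3) -> int:
    """Receptive field of stride-1 dilated convolutions applied one after another."""
    receptive_field = 1
    for dilation in dilations:
        receptive_field += dilated_kernel_extent(kernel_size, dilation) - 1
    return receptive_field


def dense_cascade_receptive_fields(dilations, kernel_size: int = 3):
    """Receptive fields present in the output of a densely connected cascade.

    With dense connections each layer receives the input and the outputs of
    all earlier layers, so every ordered subset of layers forms a path.
    The empty subset is the input itself (receptive field 1).
    """
    fields = set()
    for path_length in range(len(dilations) + 1):
        for path in combinations(dilations, path_length):
            fields.add(stacked_receptive_field(path, kernel_size))
    return sorted(fields)


# Hypothetical dilation rates; the abstract does not specify the actual values.
ASSUMED_DILATIONS = (1, 2, 4)

if __name__ == "__main__":
    # A plain cascade ends with one receptive field: 15
    print(stacked_receptive_field(ASSUMED_DILATIONS))
    # The dense cascade exposes many scales: [1, 3, 5, 7, 9, 11, 13, 15]
    print(dense_cascade_receptive_fields(ASSUMED_DILATIONS))
```

Under these assumptions, a plain cascade of three layers ends in a single 15-pixel receptive field, while the dense variant makes eight different scales available to the fusion stage. That range of scales is what a channel attention step could then reweight.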

Key words: pedestrian detection, multi-scale context, channel attention, anchor-free
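The miss-rate figure reported in the abstract is the log-average miss rate commonly used with the Caltech benchmark, usually written MR⁻²: the geometric mean of the miss rate sampled at nine false-positives-per-image (FPPI) values evenly spaced in log space between 10⁻² and 10⁰. The following is a minimal sketch of that computation, not the benchmark's official evaluation code. The function name, the toy curve, and the convention of using a miss rate of 1.0 where the curve has no point at or below a reference FPPI are assumptions.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points: int = 9) -> float:
    """Geometric mean of the miss rate at log-spaced FPPI reference points.

    `fppi` must be sorted in ascending order and `miss_rate` must hold the
    miss rate at each of those FPPI values.
    """
    if len(fppi) != len(miss_rate):
        raise ValueError("fppi and miss_rate must have the same length")

    # Reference FPPI values: 10^-2, 10^-1.75, ..., 10^0 for num_points = 9.
    references = [
        10.0 ** (-2.0 + 2.0 * i / (num_points - 1)) for i in range(num_points)
    ]

    sampled = []
    for reference in references:
        # Assumed convention: miss rate is 1.0 until the curve's first point.
        current = 1.0
        for x, y in zip(fppi, miss_rate):
            if x <= reference:
                current = y  # last curve point at or below the reference
            else:
                break
        # Clamp to avoid log(0) when a detector misses nothing.
        sampled.append(max(current, 1e-10))

    return math.exp(sum(math.log(m) for m in sampled) / len(sampled))


if __name__ == "__main__":
    # Toy miss-rate curve, invented for illustration only.
    toy_fppi = [0.005, 0.05, 0.5]
    toy_miss_rate = [0.8, 0.5, 0.3]
    print(f"MR^-2 = {100 * log_average_miss_rate(toy_fppi, toy_miss_rate):.2f}%")
```

Lower values are better. Because the mean is taken in log space, improvements at low FPPI count as much as improvements at high FPPI.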

ZHAO Shiyang, WANG Xiaofeng. Occluded Pedestrian Detection Based on Multi-Scale Context Information[J]. Computer Engineering and Applications, 2022, 58(11): 141-149.
