Person Re-Identification Combining Attention Mechanism and Weight Clustering Learning

doi:10.3778/j.issn.1002-8331.2103-0311

Abstract

In person re-identification (re-ID), pedestrian images are usually produced automatically by a pedestrian detector, so they contain not only the human body but also interference such as background and occlusion. Attention-based re-ID strengthens the focus on body parts with salient features and weakens the focus on parts that carry interference, which helps extract more discriminative person feature representations. In deep learning, convolutional neural networks obtain attentional features by re-weighting the feature map. Building on this, a novel cluster-based global attention module (CGAM) is proposed. In CGAM, attention weight learning is recast as cluster center learning: the spatial positions in the feature map are regarded as feature nodes, a clustering algorithm yields an importance score for each feature node, and the normalized scores serve as the attention weights. The attention network is obtained by embedding the attention module in an improved ResNet50 backbone; it uses only a global branch, which keeps it simple and efficient. In summary, the cluster-based attention design makes full use of the pairwise correlations between feature nodes and also mines rich global structural information, yielding a more reliable set of attention weights. Experimental results show that the proposed method is markedly effective on two popular datasets, Market-1501 and DukeMTMC-reID.

Key words: person re-identification, deep learning, attention network, attention weight, clustering algorithm
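
The weighting step described in the abstract (spatial positions as feature nodes, a per-node importance score, normalization into attention weights) can be illustrated with a minimal NumPy sketch. This is not the authors' CGAM: the abstract does not say which clustering algorithm or normalization is used, so the sketch assumes a density-style score (mean cosine similarity of a node to all other nodes) and a softmax. The function name `cluster_attention` and the `temperature` parameter are hypothetical.

```python
import numpy as np


def cluster_attention(feature_map, temperature=1.0):
    """Illustrative cluster-style spatial attention (assumed, not the paper's CGAM).

    feature_map: array of shape (C, H, W).
    Returns (weights of shape (H, W), re-weighted feature map of shape (C, H, W)).
    """
    c, h, w = feature_map.shape
    n = h * w

    # Each spatial position becomes one feature node with a C-dimensional vector.
    nodes = feature_map.reshape(c, n).T  # (N, C)

    # Pairwise cosine similarity between all feature nodes.
    norms = np.linalg.norm(nodes, axis=1, keepdims=True)
    unit = nodes / np.maximum(norms, 1e-12)
    sim = unit @ unit.T  # (N, N)

    # Assumed importance score: mean similarity to the other nodes, so a node
    # lying in a dense group of similar nodes scores higher than an outlier.
    score = (sim.sum(axis=1) - np.diag(sim)) / max(n - 1, 1)

    # Softmax normalization turns the scores into weights that sum to 1.
    z = (score - score.max()) / temperature
    weights = np.exp(z)
    weights /= weights.sum()
    weights = weights.reshape(h, w)

    # Re-weight every channel of the feature map by the spatial weights.
    attended = feature_map * weights[None, :, :]
    return weights, attended


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fmap = rng.standard_normal((8, 4, 2))  # toy feature map: C=8, H=4, W=2
    weights, attended = cluster_attention(fmap)
    print(weights.shape, attended.shape)
```

Because the score depends on every pair of nodes, each weight reflects the global structure of the feature map rather than a single position, which is the property the abstract attributes to the cluster-based design.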

SUN Jiao, YANG Youlong, CHE Jinxing. Person Re-Identification Combining Attention Mechanism and Weight Clustering Learning[J]. Computer Engineering and Applications, 2022, 58(20): 157-164.
