Effective Mask and Local Enhancement for Occluded Person Re-Identification

doi:10.3778/j.issn.1002-8331.2304-0339

Abstract

Abstract: In surveillance systems, pedestrians are often occluded by various obstacles, so occluded person re-identification remains a long-standing challenge. Recent methods based on Transformer and external semantic cues have improved feature representation and related performance, but they still suffer from weak representations and unreliable semantic cues. To address these problems, a novel Transformer-based method is proposed. First, an effective mask generation scheme is introduced; reliable masks allow the model to work without external semantic cues and to achieve automatic alignment. Second, a sequence reconstruction module based on the average attention score is proposed, which focuses on foreground information more effectively. Third, a local enhancement module is proposed to obtain a more robust feature representation. Finally, the proposed method is compared with a range of existing methods on the Occluded-Duke, Occluded-ReID, Partial-ReID and Market-1501 datasets. Rank-1 accuracy reaches 72.3%, 84.8%, 86.5% and 95.6%, respectively, and mAP reaches 62.9%, 83.2%, 76.4% and 89.9%, respectively. The experimental results show that the proposed model improves on other advanced networks.

Key words: occluded person re-identification, prototype mask, feature attention mechanism, average attention score, local enhancement, Transformer
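
The abstract reports results as Rank-1 accuracy and mAP. As a reminder of what these two retrieval metrics measure, the following is a minimal sketch and not the authors' evaluation code. The function names `rank1_and_ap` and `evaluate` are hypothetical, and the sketch assumes each query has already been reduced to a list of booleans marking which gallery images, ordered from nearest to farthest, share the query identity. The same-camera filtering commonly applied in re-identification benchmarks is left out.

```python
from typing import List, Sequence, Tuple


def rank1_and_ap(ranked_matches: Sequence[bool]) -> Tuple[float, float]:
    """Return (rank-1 hit, average precision) for a single query.

    ranked_matches[i] is True when the i-th closest gallery image
    has the same identity as the query (hypothetical input format).
    """
    hits = 0
    precision_sum = 0.0
    for position, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            # Precision measured at each position where a match occurs.
            precision_sum += hits / position
    rank1 = 1.0 if ranked_matches and ranked_matches[0] else 0.0
    ap = precision_sum / hits if hits else 0.0
    return rank1, ap


def evaluate(all_queries: Sequence[Sequence[bool]]) -> Tuple[float, float]:
    """Average the per-query values to obtain (Rank-1, mAP)."""
    results: List[Tuple[float, float]] = [rank1_and_ap(q) for q in all_queries]
    n = len(results)
    rank1 = sum(r for r, _ in results) / n
    mean_ap = sum(a for _, a in results) / n
    return rank1, mean_ap


if __name__ == "__main__":
    # Two toy queries with three gallery images each.
    toy = [[True, False, True], [False, True, False]]
    print(evaluate(toy))
```

Rank-1 only checks whether the nearest gallery image is correct, while mAP also rewards ranking every correct image near the top, which is why the two numbers in the abstract differ for each dataset.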

WANG Xiaomeng, LIANG Fengmei. Effective Mask and Local Enhancement for Occluded Person Re-Identification[J]. Computer Engineering and Applications, 2024, 60(11): 156-164.
