Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (13): 235-240. DOI: 10.3778/j.issn.1002-8331.2012-0395

• Graphics and Image Processing •

Multi-Modal Pedestrian Re-Identification Method Using a Double-Terminal Shareable Network

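LUO Qi, JIAO Minghai

  1. School of Computer Science and Engineering, Northeastern University, Shenyang 110000, China
  • Publication date: 2022-07-01 Release date: 2022-07-01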

Multi-Modal Pedestrian Recognition on Double-Terminal Shared Network

LUO Qi, JIAO Minghai   

  1. School of Computer Science and Engineering, Northeastern University, Shenyang 110000, China
  • Online: 2022-07-01 Published: 2022-07-01

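Abstract: To address the large intra-class differences and modality differences in multi-modal pedestrian re-identification, a multi-modal pedestrian re-identification method using a double-terminal shared network is proposed. Images of different modalities are first preprocessed by cropping and padding. Non-local attention blocks are embedded in the last four convolutional layers of ResNet50, and the improved ResNet50 is used as the backbone network to extract features from the images of each modality separately; the resulting features are then fed into the shared network. Finally, the model is trained with a clustering loss based on intra-class distance and modality difference. Experimental results show that the model using non-local attention blocks and the clustering loss achieves higher accuracy and is more robust.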

Key words: multimodal person re-identification, convolutional neural network, clustering loss

Abstract: To address the large intra-class differences and modality differences in multi-modal pedestrian re-identification, a multi-modal pedestrian re-identification method using a double-terminal shared network is proposed. First, images of different modalities are preprocessed by cropping and padding. Then, non-local attention blocks are embedded in the last four convolutional layers of ResNet50, the improved ResNet50 is used as the backbone network to extract features from the images of each modality separately, and the resulting features are fed into the shared network. Finally, the model is trained with a clustering loss based on intra-class distance and modality difference. Experimental results show that the model with non-local attention blocks and the clustering loss is more accurate and more robust.

Key words: multimodal person re-identification, convolutional neural network, clustering loss
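
The abstract names a clustering loss based on intra-class distance and modality difference but does not give its formula. The following is only a minimal sketch of one plausible form, assuming the loss is the mean distance of each feature to its identity center plus a weighted distance between the per-modality centers of the same identity. The function name `clustering_loss`, the `gap_weight` parameter, and the modality tags `"rgb"` and `"ir"` are hypothetical and not taken from the paper. Plain Python is used so the sketch runs without dependencies; a real training setup would compute this on tensors with automatic differentiation.

```python
import math
from collections import defaultdict


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def clustering_loss(features, labels, modalities, gap_weight=1.0):
    """Assumed form: intra-class spread + gap_weight * modality gap.

    features   : list of feature vectors (lists of floats)
    labels     : identity label for each feature
    modalities : modality tag for each feature, e.g. "rgb" or "ir"
    gap_weight : hypothetical trade-off weight (not from the paper)
    """
    by_identity = defaultdict(list)
    by_identity_modality = defaultdict(list)
    for feat, label, mod in zip(features, labels, modalities):
        by_identity[label].append(feat)
        by_identity_modality[(label, mod)].append(feat)

    # Intra-class term: mean distance of each sample to its identity center.
    centers = {label: mean_vector(fs) for label, fs in by_identity.items()}
    intra = sum(
        euclidean(feat, centers[label]) for feat, label in zip(features, labels)
    ) / len(features)

    # Modality term: mean distance between the modality centers of each identity.
    gaps = []
    for label in by_identity:
        mod_centers = [
            mean_vector(fs)
            for (lab, _), fs in by_identity_modality.items()
            if lab == label
        ]
        for i in range(len(mod_centers)):
            for j in range(i + 1, len(mod_centers)):
                gaps.append(euclidean(mod_centers[i], mod_centers[j]))
    gap = sum(gaps) / len(gaps) if gaps else 0.0

    return intra + gap_weight * gap
```

Under this assumed form, the loss is zero only when every image of an identity maps to the same feature vector regardless of modality, which matches the stated goal of reducing both intra-class and modality differences.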