Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (24): 205-211. DOI: 10.3778/j.issn.1002-8331.2106-0517

• Graphics and Image Processing •

Multi-Attention Ensemble for Image Retrieval

ZENG Aibo, CHEN Youguang   

  1. School of Computer Science and Technology, East China Normal University, Shanghai 200062, China
  2. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
  • Online:2022-12-15 Published:2022-12-15

Image Retrieval with Multi-Attention Ensemble

曾爱博,陈优广   

  1. 华东师范大学 计算机科学与技术学院,上海 200062
  2. 华东师范大学 数据科学与工程学院,上海 200062

Abstract: Second-order attention derives its output features from the relations among all input features, so those features carry much redundant information, and the individual branches of ensemble methods cannot be trained effectively. To address both problems, a multi-attention ensemble method is proposed for image retrieval. The method uses stand-alone self-attention (SASA), which performs well in image classification tasks, to capture the relations between each feature and its neighborhood and thereby produce more powerful features for retrieval. It introduces a multi-attention ensemble framework in which every attentional branch uses SASA to generate an effective feature, and these branch features are combined into the final image feature. The framework trains the model jointly with a ranking loss on the final image feature, a divergence loss across all branches, and a classification loss on each branch. Experiments on the CUB200-2011 and CARS196 retrieval datasets demonstrate that the proposed method can significantly improve retrieval accuracy.

Key words: stand-alone self-attention, attention ensemble, image retrieval
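
The abstract contrasts attention over all input features with attention restricted to each feature's neighborhood. The following is a minimal sketch of that local attention idea, not the authors' implementation. It assumes identity query/key/value projections, omits the relative position term used in stand-alone self-attention, and clips neighborhoods at the border; the function name `local_self_attention` and the window size `k` are hypothetical.

```python
import math


def local_self_attention(fmap, k=3):
    """Attend from each position to its k x k neighborhood only.

    fmap: H x W grid (nested lists) of C-dimensional feature vectors.
    Simplifications (assumptions, not from the paper): queries, keys and
    values are the raw features, and no positional term is added.
    """
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            q = fmap[i][j]
            # Collect the neighborhood, clipped at the feature-map border.
            neigh = [fmap[a][b]
                     for a in range(max(0, i - r), min(h, i + r + 1))
                     for b in range(max(0, j - r), min(w, j + r + 1))]
            # Dot-product scores followed by a numerically stable softmax.
            scores = [sum(x * y for x, y in zip(q, n)) for n in neigh]
            m = max(scores)
            exps = [math.exp(s - m) for s in scores]
            z = sum(exps)
            weights = [e / z for e in exps]
            # Output is the attention-weighted sum of neighborhood features.
            row.append([sum(wt * n[c] for wt, n in zip(weights, neigh))
                        for c in range(len(q))])
        out.append(row)
    return out
```

With `k=1` each position attends only to itself and the map is returned unchanged; a window large enough to cover the whole map makes every position attend to all features, which is the global behavior the abstract describes as producing redundant information.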

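Abstract: In image retrieval, second-order attention modules build features from the relations among all global features, which leaves the output features with a large amount of redundant information, and the branches of ensemble mechanisms cannot be fully trained. To address these problems, an image retrieval method based on a multi-attention ensemble is proposed. The method uses the stand-alone self-attention module, which performs well in image classification, to capture relations among local features and generate higher-quality features for retrieval. It proposes a multi-attention ensemble framework in which each attention branch uses a stand-alone self-attention module to produce an effective image feature, and these features are combined into the final image feature. The framework trains the model jointly with a ranking loss on the final image feature, a divergence loss among the attention branches, and an image classification loss on each branch, so that every branch is fully trained. Experiments on the CUB200-2011 and CARS196 image retrieval datasets show that the proposed method effectively improves retrieval accuracy.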

Key words: stand-alone self-attention, attention ensemble, image retrieval
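
The joint training objective named in the abstract (a ranking loss on the final feature, a divergence loss across branches, and a classification loss per branch) could be combined roughly as sketched below. The abstract does not give the exact loss forms, so everything concrete here is an assumption: a triplet margin loss stands in for the ranking loss, a pairwise hinge on squared distance stands in for the divergence loss, branch features are fused by concatenation, and the weights `w_div` and `w_cls` and all function names are hypothetical.

```python
import math
from itertools import combinations


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def combine(branch_feats):
    """Assumed fusion: concatenate branch features, then L2-normalize."""
    flat = [x for f in branch_feats for x in f]
    norm = math.sqrt(sum(x * x for x in flat)) or 1.0
    return [x / norm for x in flat]


def triplet_loss(anchor, positive, negative, margin=0.1):
    """Stand-in ranking loss computed on the final image features."""
    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)


def divergence_loss(branch_feats, margin=0.5):
    """Penalize branch pairs whose features for one image are too close.

    Needs at least two branches.
    """
    pairs = list(combinations(branch_feats, 2))
    return sum(max(0.0, margin - sq_dist(a, b)) for a, b in pairs) / len(pairs)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_z - logits[label]


def joint_loss(anchor_b, pos_b, neg_b, anchor_logits, label,
               w_div=1.0, w_cls=1.0):
    """Weighted sum of the three terms; each *_b is a list of branch features."""
    rank = triplet_loss(combine(anchor_b), combine(pos_b), combine(neg_b))
    div = divergence_loss(anchor_b)
    # One classifier per branch, so every branch receives its own gradient.
    cls = sum(cross_entropy(lg, label) for lg in anchor_logits) / len(anchor_logits)
    return rank + w_div * div + w_cls * cls
```

The per-branch classification term is what gives each branch a direct training signal, while the divergence term discourages the branches from collapsing onto the same feature; how the paper weights the three terms is not stated in the abstract.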