Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (21): 149-155. DOI: 10.3778/j.issn.1002-8331.2103-0325

• Pattern Recognition and Artificial Intelligence •

Semantic Preserving Hash for Cross-Modal Retrieval

KANG Peipei, LIN Zehang, YANG Zhenguo, ZHANG Zitong, LIU Wenyin

  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
  2. Department of Computing, Hong Kong Polytechnic University, Hong Kong 999077, China
  3. College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China
  • Online: 2022-11-01 Published: 2022-11-01

Abstract: Hash representations reduce storage cost and speed up retrieval, so hashing-based cross-modal retrieval has attracted considerable attention. Most supervised cross-modal hashing methods make the hash codes semantically discriminative through regression or graph constraints. However, these methods ignore the semantic discrimination of the hash functions, so out-of-sample (new) data cannot obtain semantic-preserving hash codes, which limits retrieval accuracy. To learn semantic-preserving hash codes and hash functions simultaneously, this paper proposes semantic preserving hash (SPH) for cross-modal retrieval. SPH introduces two modality-specific hash functions that map samples from the different modality spaces into a common Hamming space. To make both the hash codes and the hash functions semantically discriminative, SPH introduces a semantic structure graph and, drawing on the idea of locality preservation, fuses hash code learning and hash function learning into a single framework so that the two are optimized together. Extensive experiments on three public multimodal datasets demonstrate the effectiveness and superiority of SPH for cross-modal retrieval.

Key words: cross-modal retrieval, cross-modal hashing, semantic preserving, supervised learning
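
The ingredients named in the abstract (two modality-specific hash functions, a common Hamming space, a semantic graph, and a locality-preserving criterion) can be illustrated with a minimal NumPy sketch. This is not the SPH objective or its optimization, which the abstract does not specify. The linear projections, the label-overlap graph, the pairwise loss, and all function names below are assumptions chosen only for illustration.

```python
import numpy as np


def semantic_graph(labels):
    """Assumed graph: S[i, j] = 1 if samples i and j share a label, else 0."""
    return (labels @ labels.T > 0).astype(float)


def hash_codes(X, W):
    """Assumed linear hash function: project features, binarize to {-1, +1}."""
    return np.where(X @ W >= 0, 1, -1)


def locality_loss(B_a, B_b, S):
    """Sum of S[i, j] * ||B_a[i] - B_b[j]||^2 over all cross-modal pairs.

    A small value means semantically similar pairs received nearby codes.
    """
    sq_dist = ((B_a[:, None, :] - B_b[None, :, :]) ** 2).sum(axis=2)
    return float((S * sq_dist).sum())


def hamming_distance(B_query, B_db):
    """Hamming distance between {-1, +1} codes, via the inner product."""
    n_bits = B_query.shape[1]
    return (n_bits - B_query @ B_db.T) // 2
```

In a cross-modal query under these assumptions, an image would be encoded with one projection matrix and the text database with the other, and the database would then be ranked by `hamming_distance`. A learning procedure would choose the two projection matrices so that `locality_loss` over the semantic graph is small.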