Adaptively Efficient Deep Cross-Modal Hash Retrieval Based on Incremental Learning

doi:10.3778/j.issn.1002-8331.2108-0349

Abstract

Abstract: Current deep cross-modal hash retrieval algorithms have two problems: they cannot effectively retrieve data from categories outside the training set, and relaxing the discrete constraint on hash codes leads to sub-optimal solutions. To address these problems, an adaptive deep incremental hashing (ADIH) retrieval algorithm is proposed. It directly learns the hash codes of newly arriving categories while keeping the hash codes of the original training data unchanged. To preserve the similarity and dissimilarity among multi-modal data, the hash codes are projected into a latent semantic subspace, and a cross-modal optimization algorithm that keeps the discrete constraint is proposed to solve for the hash codes without any relaxation. In addition, because there is currently no effective method for evaluating the complexity of deep hashing algorithms, a complexity analysis method based on neuron update operations in the neural network is proposed to compare the complexity of deep hashing algorithms. Experimental results on public datasets show that the proposed algorithm requires less training time than the comparison algorithms and achieves higher retrieval accuracy.

Key words: incremental learning, hash coding, semantic preservation, latent space, cross-modal retrieval
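The core incremental idea in the abstract (freeze the existing hash codes, then learn binary codes only for the new data without relaxing the ±1 constraint) can be illustrated with a small NumPy sketch. This is not the authors' ADIH method: it uses a single modality, no neural network and no latent subspace, and the pairwise objective ||k·S − B_new·B_oldᵀ||² with bit-by-bit sign updates is an assumed, commonly used asymmetric formulation chosen only for illustration. The function name `learn_new_codes` and all toy data are hypothetical.

```python
import numpy as np


def learn_new_codes(B_old, S, n_iters=5, seed=0):
    """Learn +/-1 codes for new items while leaving B_old untouched.

    B_old: (n, k) fixed codes in {-1, +1} for the existing database.
    S:     (m, n) similarity in {-1, +1} between new and existing items.
    Assumed objective (illustrative only): ||k*S - B_new @ B_old.T||_F^2.
    """
    rng = np.random.default_rng(seed)
    n, k = B_old.shape
    m = S.shape[0]
    B_new = rng.choice([-1, 1], size=(m, k))
    Q = k * S @ B_old  # (m, k) linear term of the objective
    losses = []
    for _ in range(n_iters):
        for j in range(k):
            rest = [c for c in range(k) if c != j]
            u_j = B_old[:, j]
            # Contribution of all other bits, coupled through the old codes.
            residual = B_new[:, rest] @ (B_old[:, rest].T @ u_j)
            # Exact minimiser for bit j under the +/-1 constraint (no relaxation).
            B_new[:, j] = np.where(Q[:, j] - residual >= 0, 1, -1)
        losses.append(float(np.sum((k * S - B_new @ B_old.T) ** 2)))
    return B_new, losses


# Toy data (hypothetical): 40 existing items from classes 0..2,
# 10 new items from classes 0..3, where class 3 is unseen.
rng = np.random.default_rng(42)
k = 16
labels_old = rng.integers(0, 3, size=40)
labels_new = rng.integers(0, 4, size=10)
B_old = rng.choice([-1, 1], size=(40, k))
B_old_before = B_old.copy()
S = np.where(labels_new[:, None] == labels_old[None, :], 1, -1)

B_new, losses = learn_new_codes(B_old, S)
print("loss per sweep:", losses)
```

Because each bit update is the exact minimiser of the assumed objective with the other bits held fixed, the loss cannot increase from one sweep to the next, and the existing codes `B_old` are only read, never written.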

ZHOU Kun, XU Liming, ZHENG Bochuan, XIE Yicai. Adaptively Efficient Deep Cross-Modal Hash Retrieval Based on Incremental Learning[J]. Computer Engineering and Applications, 2023, 59(2): 85-93.
