Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (21): 123-130.DOI: 10.3778/j.issn.1002-8331.2003-0175

Previous Articles     Next Articles

Ambiguous Ranking Perceptual Hashing

WANG Zhen, SUN Fuzhen, ZHANG Longbo, WANG Lei   

  1. School of Computer Science and Technology, Shandong University of Technology, Zibo, Shandong 255000, China
  • Online:2020-11-01 Published:2020-11-03

模糊序列感知哈希

王振,孙福振,张龙波,王雷   

  1. 山东理工大学 计算机科学与技术学院,山东 淄博 255000

Abstract:

For traditional algorithms, the ambiguous ranking orders often cause an inferior ANN search performance. To solve this problem, the Ambiguous Ranking Perceptual Hashing(ARPH) is proposed. During the training process, the rule for distinguishing ambiguous ranking is defined in advance, and the hashing functions which satisfy the specify rule are learnt. As a result, the ranking orders of the data points which share the same Hamming distance to a query sample can be effectively distinguished without computing the bitwise weight and weighted Hamming distances. The objective function based on the ANN search performance evaluation metric of average precision is defined, and it belongs to the ordinal relation preserving restriction. Thus, the original Euclidean relative similarity relationship can be well preserved in the Hamming space. During the training process, the binary code, Hamming distances and judgment values are relaxed to continuous formulation. Finally, the gradient descent algorithm can be used to learn hashing functions. The ANN search comparative experiments on three image datasets show that the proposed ambiguous ranking perceptual hashing can achieve the best ANN search performance.

Key words: ordinal relation preserving, first position decision, hashing algorithm, approximate nearest neighbors search

摘要:

为了解决传统哈希算法在图像近邻检索任务中的模糊排序问题,提出了模糊序列感知哈希,旨在学习满足首位区分规则的哈希函数,其可直接利用二值编码本身信息区分模糊序列,从而在近邻检索中无需额外计算比特位权值和加权汉明距离,能以较小的代价区分与查询样本具有相同汉明距离的数据点之间的序列。建立了类似于近邻检索性能评价指标平均准确率的目标函数,其属于序列保持约束条件,能够保证数据点对在汉明空间与欧式空间内具有相同的相对相似性,可确保所提算法适应于近邻检索任务。在训练过程中,对二值编码、汉明距离以及判断函数进行了连续化松弛处理,从而可直接采用批量梯度下降算法优化目标函数,降低了训练复杂度。在三种图像数据集上的对比实验证明,模糊序列感知哈希的近邻检索性能较优。

关键词: 序列保持, 首位区分, 哈希算法, 近邻检索