Robust audio fingerprint using lifting-based wavelet and non-negative matrix factorization

Abstract

A robust audio fingerprinting method based on the lifting-based wavelet transform and non-negative matrix factorization (NMF) is proposed for content-based audio retrieval (CBAR) applications. The original audio is divided into fixed-length frames, and a lifting-based wavelet transform is applied to each frame to obtain its low-frequency (approximation) and high-frequency (detail) coefficients. NMF is then applied to the detail coefficients, so that the audio sub-frames are approximately represented as the product of a base matrix and a coefficient matrix. The elements of each column of the coefficient matrix are summed, and each column sum is quantized to produce 1 bit of the frame's fingerprint sequence. Experimental results show that the proposed scheme is robust against common audio processing operations and insensitive to small local changes in the audio, while still distinguishing well between different audio signals, which makes it suitable for audio retrieval.

Keywords: audio fingerprint, audio digest, non-negative matrix factorization, lifting-based wavelet, audio retrieval
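The pipeline outlined in the abstract (framing, lifting wavelet, NMF, column sums, 1-bit quantization) can be illustrated with a minimal Python sketch. It is not the author's implementation, and most specifics are assumptions because the abstract does not state them: a one-level Haar wavelet stands in for the lifting scheme, the absolute detail coefficients are arranged with one sub-frame per column to form the non-negative matrix, NMF uses the standard multiplicative updates, and each column sum is quantized by comparing it with the median of the frame's column sums. The names `haar_lifting`, `nmf`, `frame_bits`, and `fingerprint`, and the parameters `frame_len`, `n_sub`, and `rank`, are hypothetical.

```python
import numpy as np


def haar_lifting(x):
    """One level of the Haar wavelet computed by lifting: split, predict, update."""
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]      # split into even and odd samples
    detail = odd - even               # predict: high-frequency (detail) part
    approx = even + detail / 2.0      # update: low-frequency part (pairwise mean)
    return approx, detail


def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Factor non-negative V (m x n) as W (m x rank) @ H (rank x n)
    using multiplicative updates for the Frobenius-norm objective."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
    return W, H


def frame_bits(frame, n_sub=16, rank=2):
    """Fingerprint bits of one frame: one bit per sub-frame (column)."""
    _, detail = haar_lifting(frame)
    # Assumption: absolute detail coefficients, one sub-frame per column.
    V = np.abs(detail).reshape(n_sub, -1).T
    _, H = nmf(V, rank)
    col_sums = H.sum(axis=0)          # sum each column of the coefficient matrix
    # Assumption: quantize each sum against the median of this frame's sums.
    return (col_sums > np.median(col_sums)).astype(np.uint8)


def fingerprint(signal, frame_len=1024, n_sub=16, rank=2):
    """Concatenate per-frame bits over non-overlapping fixed-length frames."""
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.concatenate([frame_bits(f, n_sub, rank) for f in frames])
```

With these assumed defaults, a 4096-sample signal gives 4 frames of 16 bits each. Two fingerprints would typically be compared by their bit error rate, for example `np.mean(fp_a != fp_b)`, with the decision threshold chosen experimentally.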


LONG Xiaobao. Robust audio fingerprint using lifting-based wavelet and non-negative matrix factorization[J]. Computer Engineering and Applications, 2013, 49(9): 197-199.

龙小保. 使用提升小波和非负矩阵分解的稳健音频指纹[J]. 计算机工程与应用, 2013, 49(9): 197-199.

