Computer Engineering and Applications, 2011, Vol. 47, Issue (18): 146-148.

• Database, Signal and Information Processing •

Application of gradient orientation histogram in correspondence between spectrograms

CHEN Yanxiang1, LIU Ming2

  1. College of Computer Science & Information, Hefei University of Technology, Hefei 230009, China
  2. Department of Electrical & Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2011-06-21 Published: 2011-06-21

Abstract: A spectrogram is an image that displays time-varying spectral magnitude. A correspondence between spectrograms is established based on the Gradient Orientation Histogram (GOH) to find their corresponding frequency structures, which provides a route to speaker normalization and further speech processing. Before feature parameters are extracted, GOH is used to describe the local feature of each point in a spectrogram, and a non-linear correspondence along the frequency axis is then established between the spectrograms of two speakers. In essence, the method finds the optimal match by dynamic programming, given a similarity measure between frequency bins. Experiments on the TIDIGITS corpus show that the method clearly reduces the system error rate when the training and test sets are mismatched.

Key words: Gradient Orientation Histogram (GOH), spectrogram correspondence, speaker normalization, dynamic programming
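
The two steps named in the abstract (describing frequency bins with gradient orientation histograms, then matching the frequency axes of two spectrograms by dynamic programming) can be illustrated with a minimal Python sketch. It is not the authors' implementation, since the abstract gives no algorithmic detail. The function names `goh_descriptors` and `align_frequency_axes`, the choice of 8 orientation bins, the pooling of each histogram over the whole time axis, the Euclidean distance, and the DTW-style step pattern are all assumptions made for illustration.

```python
import numpy as np


def goh_descriptors(spec, n_bins=8):
    """Return one gradient orientation histogram per frequency bin (row).

    spec: 2-D array of shape (n_freq, n_time), e.g. a log-magnitude spectrogram.
    Assumption: each histogram pools gradients over the whole time axis.
    """
    d_freq, d_time = np.gradient(spec)        # derivatives along axis 0 and axis 1
    magnitude = np.hypot(d_freq, d_time)
    orientation = np.arctan2(d_freq, d_time)  # angle in [-pi, pi]
    # Quantize each angle into one of n_bins equal-width orientation bins.
    idx = np.floor((orientation + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    hist = np.zeros((spec.shape[0], n_bins))
    for f in range(spec.shape[0]):
        # Magnitude-weighted vote count for frequency bin f.
        hist[f] = np.bincount(idx[f], weights=magnitude[f], minlength=n_bins)
    norm = np.linalg.norm(hist, axis=1, keepdims=True)
    return hist / np.maximum(norm, 1e-12)     # unit-length rows, guarded against zero


def align_frequency_axes(desc_a, desc_b):
    """Monotonic alignment of two descriptor sequences by dynamic programming.

    Returns a list of (i, j) pairs mapping frequency bin i of speaker A
    to frequency bin j of speaker B. Assumption: DTW-style steps
    (1,0), (0,1), (1,1) and Euclidean distance as the dissimilarity.
    """
    n, m = len(desc_a), len(desc_b)
    cost = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    acc = np.full((n, m), np.inf)             # accumulated cost table
    acc[0, 0] = cost[0, 0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = min(
                acc[i - 1, j] if i > 0 else np.inf,
                acc[i, j - 1] if j > 0 else np.inf,
                acc[i - 1, j - 1] if i > 0 and j > 0 else np.inf,
            )
            acc[i, j] = cost[i, j] + best
    # Backtrack from the last pair of bins to the first.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1, j], i - 1, j))
        if j > 0:
            candidates.append((acc[i, j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    return path[::-1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spec_a = rng.random((12, 20))             # synthetic stand-ins for spectrograms
    spec_b = rng.random((15, 20))
    mapping = align_frequency_axes(goh_descriptors(spec_a), goh_descriptors(spec_b))
    print(mapping)
```

The returned path is a non-linear, monotonic mapping between the two frequency axes. A real system would presumably compute the descriptors on local time-frequency patches of actual speech rather than on random arrays.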