Computer Engineering and Applications, 2017, Vol. 53, Issue (17): 160-165. DOI: 10.3778/j.issn.1002-8331.1610-0315

• Pattern Recognition and Artificial Intelligence •

Automatic identification of Chinese dialects based on global information fusion

QIU Yuanhang1, GU Mingliang1, MA Yong1, JIN Yun1, HAN Jun1, ZHAO Dongmei1, ZHAO Chenghao2

  1. School of Physics and Electronic Engineering, Jiangsu Normal University, Xuzhou, Jiangsu 221116, China
  2. School of Electrical Engineering & Automation, Jiangsu Normal University, Xuzhou, Jiangsu 221116, China
  • Online: 2017-09-01 Published: 2017-09-12

Abstract: A Chinese dialect identification method is proposed that combines the identity vector (I-vector) with prosodic information. A total variability (TV) space replaces the separate eigenvoice and eigenchannel spaces and maps each high-dimensional supervector to a low-dimensional I-vector, to which channel compensation and feature dimensionality reduction are then applied. Chinese is a tonal language, and its dialects differ markedly in prosodic structure such as rhythm and stress, so serially fusing I-vectors with global prosodic information adds discriminative information between dialects. In identification experiments on five Chinese dialects (including Min, Yue and Wu) plus Mandarin, the fused features reach an Equal Error Rate (EER) of 8.01%, which is 56.2% lower than that of the Gaussian Mixture Model-Universal Background Model (GMM-UBM) method. The experimental results show that the I-vector method fused with global prosodic information effectively improves Chinese dialect identification accuracy.

Key words: Chinese dialect identification, prosodic features, I-vector, feature fusion
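
To make the two central notions of the abstract concrete, the following is a minimal Python sketch of serial fusion (concatenating an utterance's I-vector with its global prosodic feature vector) and of an EER computation from detection scores. It is an illustration under stated assumptions, not the authors' implementation: the function names, the length normalization applied before concatenation, and the threshold sweep used to find the EER are all hypothetical choices that the abstract does not specify.

```python
import math
from typing import List, Sequence


def length_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean norm (assumed preprocessing step)."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [v / norm for v in vec]


def serial_fusion(ivector: Sequence[float], prosody: Sequence[float]) -> List[float]:
    """Serial fusion: concatenate the normalized I-vector and prosodic vector."""
    return length_normalize(ivector) + length_normalize(prosody)


def equal_error_rate(target_scores: Sequence[float],
                     nontarget_scores: Sequence[float]) -> float:
    """Approximate the EER by sweeping thresholds over the observed scores.

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where the false rejection rate (FRR) and the false
    acceptance rate (FAR) are closest, and reported as their average.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")
    best_gap = None
    eer = None
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Toy 2-dimensional "I-vector" and 2-dimensional "prosodic" vector.
    fused = serial_fusion([3.0, 4.0], [0.0, 2.0])
    print(len(fused))
    # Toy scores: one target trial scores below one non-target trial.
    print(equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.6, 0.2, 0.1, 0.05]))
```

In a real system the fused vector would feed a back-end classifier and the scores would come from dialect detection trials; the toy values here only exercise the two functions.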