Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (33): 109-111.

• Database, Signal and Information Processing •

Chinese dialect identification based on semi-supervised vector quantization

GU Mingliang1,2, ZHANG Biao2

  1. School of Physics and Electronic Engineering, Xuzhou Normal University, Xuzhou, Jiangsu 221116, China
  2. Jiangsu Key Laboratory of Language Science and Neural Cognitive Engineering, Xuzhou, Jiangsu 221116, China
  • Received: 1900-01-01  Revised: 1900-01-01  Online: 2011-11-21  Published: 2011-11-21

Abstract: This paper presents a novel codebook model for Chinese dialect identification. The method applies a semi-supervised approach to the vector quantization of dialect speech data, producing a codebook model that carries supervision information. This effectively addresses the problem of low codebook precision in Chinese dialect identification and substantially improves the system's recognition rate. Experimental results show that codebook quantization with supervision information clearly outperforms the traditional LBG vector quantization method. For three Chinese dialects, the system reaches an identification rate of 94.23%, raising the correct identification rate by nearly 13% compared with a traditional codebook identification system.

Key words: dialect identification, semi-supervised codebook, vector quantization
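
For orientation, the traditional baseline named in the abstract (one LBG-trained codebook per dialect, with an utterance assigned to the dialect whose codebook quantizes it with the least average distortion) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' system: the abstract does not describe the semi-supervised codebook construction, so it is not reproduced here. The dialect labels "A", "B", "C", the synthetic 2-D feature vectors, the codebook size, and all function names are assumptions made for the example.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, codebook):
    """Index of the codeword closest to vec."""
    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def lbg(vectors, size, eps=0.01, iters=20):
    """Train a codebook by LBG splitting; `size` is assumed to be a power of two."""
    codebook = [mean(vectors)]
    while len(codebook) < size:
        # Split every codeword into a pair shifted by +eps and -eps.
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        for _ in range(iters):
            # Assign each vector to its nearest codeword, then re-centre.
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[nearest(v, codebook)].append(v)
            # An empty cell keeps its previous codeword.
            codebook = [mean(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook

def avg_distortion(vectors, codebook):
    """Mean quantization error of an utterance against one codebook."""
    total = sum(sq_dist(v, codebook[nearest(v, codebook)]) for v in vectors)
    return total / len(vectors)

def identify(vectors, codebooks):
    """Label of the codebook that yields the smallest average distortion."""
    return min(codebooks, key=lambda label: avg_distortion(vectors, codebooks[label]))

# Synthetic stand-in for acoustic features (hypothetical data, not from the paper).
rng = random.Random(0)

def sample(centers, n):
    """Draw n 2-D vectors around randomly chosen cluster centers."""
    out = []
    for _ in range(n):
        cx, cy = rng.choice(centers)
        out.append([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)])
    return out

centers = {
    "A": [(0, 0), (0, 3)],
    "B": [(10, 10), (10, 13)],
    "C": [(-10, 10), (-10, 13)],
}
codebooks = {label: lbg(sample(c, 200), size=4) for label, c in centers.items()}
utterance = sample(centers["B"], 50)
print(identify(utterance, codebooks))
```

In a real system the feature vectors would come from speech frames rather than synthetic clusters, and the codebooks would be far larger. The sketch only shows the decision rule that a supervised or semi-supervised codebook would plug into.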