Computer Engineering and Applications (计算机工程与应用) ›› 2019, Vol. 55 ›› Issue (24): 102-109. DOI: 10.3778/j.issn.1002-8331.1808-0265

• Pattern Recognition and Artificial Intelligence •

Automatic Detection Algorithm for Cleft Palate Speech Pharyngeal Fricatives Combined with PECGTFs and SSMC
(结合PECGTFs和SSMC的腭裂语音咽擦音自动检测算法)

FU Jia (付佳), TIAN Ting (田婷), TANG Ming (唐铭), HE Ling (何凌), YIN Heng (尹恒)

  1. College of Electrical Engineering and Information Technology, Sichuan University, Chengdu 610065, China
  2. West China Hospital of Stomatology, Sichuan University, Chengdu 610041, China
  • Online: 2019-12-15 Published: 2019-12-11

Abstract: The pharyngeal fricative is a common compensatory articulation error in cleft palate speech, and its automatic detection is clinically important for evaluating velopharyngeal function. This work studies an automatic detection algorithm for pharyngeal fricatives in cleft palate speech. Piecewise Exponent Compression Gammatone Filters (PECGTFs) are combined with a Softsign-based Multi-Channel (SSMC) model to extract spectral feature parameters from the speech signal, and a k-nearest neighbors (KNN) classifier performs the detection. The experiments test 306 speech samples and compare how different Gammatone filters, Difference of Gaussian (DoG) enhancement, and SSMC model enhancement affect the detection results. The algorithm combining PECGTFs and SSMC reaches a detection accuracy of 94.95%, which offers a useful reference for clinical diagnosis.

Key words: cleft palate speech, pharyngeal fricatives, piecewise exponent compression, Gammatone filter bank, Softsign-based Multi-Channel (SSMC) model
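
The abstract names the processing chain (Gammatone filter bank, piecewise exponent compression, Softsign-based enhancement, KNN classification) but does not give its formulas. The following is a minimal Python sketch of that general kind of pipeline, not the authors' PECGTFs or SSMC formulation. Assumptions not taken from the paper: the standard fourth-order gammatone impulse response with Glasberg and Moore ERB bandwidths, a hypothetical two-segment exponent rule (`split_hz`, `low_exp`, `high_exp`), the plain softsign function x / (1 + |x|) applied to standardised channel energies, and all function names, frame sizes, and default values.

```python
import numpy as np


def erb_center_freqs(f_lo, f_hi, n):
    """Centre frequencies equally spaced on the ERB-rate scale (Glasberg and Moore)."""
    to_erb = lambda f: 21.4 * np.log10(4.37e-3 * f + 1.0)
    from_erb = lambda e: (10.0 ** (e / 21.4) - 1.0) / 4.37e-3
    return from_erb(np.linspace(to_erb(f_lo), to_erb(f_hi), n))


def gammatone_ir(fc, fs, dur=0.025, order=4):
    """Impulse response t^(order-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t), truncated to dur."""
    t = np.arange(int(dur * fs)) / fs
    b = 1.019 * 24.7 * (4.37e-3 * fc + 1.0)  # bandwidth derived from the ERB at fc
    ir = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return ir / np.sqrt(np.sum(ir ** 2))  # unit-energy normalisation


def band_energies(x, fs, fcs, frame=400, hop=160):
    """Mean-square energy per frame and channel, shape (n_frames, n_channels)."""
    n_frames = 1 + (len(x) - frame) // hop
    out = np.empty((n_frames, len(fcs)))
    for j, fc in enumerate(fcs):
        y = np.convolve(x, gammatone_ir(fc, fs), mode="same")
        for i in range(n_frames):
            seg = y[i * hop : i * hop + frame]
            out[i, j] = np.mean(seg ** 2)
    return out


def piecewise_exponent_compress(e, fcs, split_hz=2000.0, low_exp=0.5, high_exp=1.0 / 3.0):
    """HYPOTHETICAL rule: one exponent for channels below split_hz, another above."""
    exps = np.where(fcs < split_hz, low_exp, high_exp)
    return e ** exps  # exponents broadcast across frames


def softsign(x):
    """Softsign squashing function, bounded in (-1, 1)."""
    return x / (1.0 + np.abs(x))


def features(x, fs, n_channels=16):
    """One feature vector per utterance: softsign of the standardised spectral shape."""
    fcs = erb_center_freqs(100.0, 0.45 * fs, n_channels)
    e = piecewise_exponent_compress(band_energies(x, fs, fcs), fcs)
    m = e.mean(axis=0)                      # average over frames
    z = (m - m.mean()) / (m.std() + 1e-12)  # standardise across channels
    return softsign(z)


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    d = np.linalg.norm(train_X - query, axis=1)
    nearest = train_y[np.argsort(d)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]
```

With a labelled set of recordings, `features` would be applied to each utterance to build `train_X`, and `knn_predict` would then label a new utterance. The sketch illustrates the data flow only and makes no claim to reproduce the reported 94.95% accuracy.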