Computer Engineering and Applications ›› 2007, Vol. 43 ›› Issue (20): 28-30.

• Academic Discussion •

Effectiveness of a single-visual-channel lipreading system

CHEN Rong (陈 蓉), YAO Hong-xun (姚鸿勋), HONG Xiao-peng (洪晓鹏), WAN Yu-qi (万玉奇)

  1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-07-11 Published: 2007-07-11
  • Contact: CHEN Rong

Abstract: For building a large-vocabulary lipreading system that relies on the visual channel alone, this paper proposes a normalized two-level feature extraction method, U-LDCT-KL. The lip region is partitioned into blocks, DCT (Discrete Cosine Transform) coefficients are computed for each block, and a second-level KL (Karhunen-Loeve) transform removes the overlap among those local coefficients. The method extracts the most effective low-level semantic features for lipreading and selects features more reasonably according to their discriminability. With 42-dimensional second-level visual features, the recognition accuracy for lip-movement content reaches 77.8% in the speaker-dependent case. Experiments also show that block-wise DCT features of the lip region are the most effective features for a single-visual-channel lipreading system.

Key words: lipreading, Discrete Cosine Transform (DCT), Karhunen-Loeve Transform (KL)
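
The two-level idea in the abstract (block-wise DCT of the lip region, followed by a KL transform that decorrelates the local coefficients) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the 8x8 block size, the 3x3 low-frequency coefficients kept per block, the 32x48 frame size, and the random stand-in frames are all assumptions made for illustration, and the normalization step of U-LDCT-KL is omitted because the abstract does not describe it. Only the 42-dimensional output size comes from the abstract.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (rows are basis vectors)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def block_dct_features(lip, block=8, keep=3):
    """First level: tile the lip image, take a 2-D DCT of each tile, and keep
    the keep x keep lowest-frequency coefficients (block and keep are assumed)."""
    h, w = lip.shape
    d = dct_matrix(block)
    feats = []
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            tile = lip[r:r + block, c:c + block].astype(float)
            coeffs = d @ tile @ d.T          # separable 2-D DCT
            feats.append(coeffs[:keep, :keep].ravel())
    return np.concatenate(feats)

def fit_kl(X, n_components=42):
    """Second level: KL transform (PCA) fitted on rows of X.
    Returns the mean and the leading eigenvectors of the covariance matrix."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:n_components]
    return mean, vecs[:, order]

def apply_kl(X, mean, basis):
    """Project first-level features onto the KL basis."""
    return (X - mean) @ basis

# Hypothetical data: 200 random "lip frames" of 32 x 48 pixels.
rng = np.random.default_rng(0)
frames = rng.random((200, 32, 48))
raw = np.stack([block_dct_features(f) for f in frames])
mean, basis = fit_kl(raw, n_components=42)
features = apply_kl(raw, mean, basis)
print(raw.shape, features.shape)  # (200, 216) (200, 42)
```

In this sketch the first level yields 24 blocks x 9 coefficients = 216 values per frame, and the KL projection reduces them to 42 mutually uncorrelated components ordered by decreasing variance. A real system would fit the KL basis on training frames of actual lip regions and feed the resulting vectors to a recognizer.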