Computer Engineering and Applications ›› 2016, Vol. 52 ›› Issue (21): 146-151.

• Pattern Recognition and Artificial Intelligence •

Realizing speech to gesture conversion by keyword spotting

ZHAO Na, YANG Hongwu

  1. College of Physics and Electronic Engineering, Northwest Normal University, Lanzhou 730070, China
  • Online: 2016-11-01 Published: 2016-11-17

Abstract: This paper proposes a speech-to-gesture conversion method to support communication between people with speech impairments and people without them. Keyword spotting is used to recognize keywords in the input speech signal. In parallel, three-dimensional gesture models of the keywords are built with 3D modeling technology according to the “Chinese Sign Language”. The conversion is completed by using OpenGL to play the 3D gesture that corresponds to each keyword-spotting result. Tests show that the keyword spotter achieves an average recognition rate of 90.1% on letters and numbers, and the converted gestures obtain a mean opinion score of 4.4. The proposed method can therefore be applied to communication between people with and without speech impairments.

Key words: keyword spotting, gesture modeling, speech to gesture conversion
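
The pipeline described in the abstract (spot keywords, look up the matching 3D gesture, then play it) and its two reported metrics can be illustrated with a minimal Python sketch. This is not the authors' implementation: the clip names, the `GESTURE_CLIPS` table, and all three function names are hypothetical, the keyword spotter and the OpenGL playback are left out, and the 1-to-5 rating scale and the per-keyword definition of recognition rate are assumptions that the abstract does not state.

```python
from statistics import mean

# Hypothetical lookup table: spotted keyword -> name of a pre-built 3D gesture clip.
GESTURE_CLIPS = {
    "A": "gesture_letter_a",
    "B": "gesture_letter_b",
    "1": "gesture_digit_1",
    "2": "gesture_digit_2",
}


def keywords_to_gestures(spotted, clips=GESTURE_CLIPS):
    """Return the gesture clips to play, in order, skipping unknown keywords."""
    return [clips[word] for word in spotted if word in clips]


def recognition_rate(reference, recognized):
    """Fraction of keywords recognized correctly, compared position by position.

    Assumed definition; the abstract only reports the resulting average.
    """
    if not reference or len(reference) != len(recognized):
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(ref == hyp for ref, hyp in zip(reference, recognized))
    return correct / len(reference)


def mean_opinion_score(ratings):
    """Average of viewer ratings, assuming the conventional 1-5 scale."""
    if not ratings or any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must be non-empty and lie in the range 1..5")
    return mean(ratings)


if __name__ == "__main__":
    # Illustrative inputs only; these are not data from the paper.
    print(keywords_to_gestures(["A", "x", "2"]))
    print(recognition_rate(list("AB12"), list("AB13")))
    print(mean_opinion_score([4, 5, 4, 5]))
```

In a full system, each returned clip name would be handed to a renderer that plays the corresponding 3D gesture model; unknown keywords are dropped here for simplicity, which is a design choice of this sketch.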