Computer Engineering and Applications ›› 2014, Vol. 50 ›› Issue (8): 207-210.

• Signal Processing •

Speech recognition system for automatic endoscope positioning

马  宁,陈晓冬,李亚楠,尹青云,汪毅,郁道银   

  1. College of Precision Instruments & Opto-electronics Engineering, Tianjin University, Tianjin 300072
  • Publication date: 2014-04-15 Release date: 2014-05-30

Speech recognition for endoscopic automatic positioning system

MA Ning, CHEN Xiaodong, LI Yanan, YIN Qingyun, WANG Yi, YU Daoyin   

  1. College of Precision Instruments & Opto-electronics Engineering, Tianjin University, Tianjin 300072, China
  • Online: 2014-04-15 Published: 2014-05-30

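Abstract: This paper proposes a speaker-dependent speech recognition system for automatic endoscope positioning. The system positions the endoscope by recognizing the spoken control commands of a specific doctor, offering a more intelligent alternative to hand-held endoscope operation. For recognition, it proposes the Normalized Average-Dynamic Time Warping (NA-DTW) algorithm, a dynamic time warping variant that uses normalized, averaged reference templates and achieves a higher recognition rate. The system uses an on-chip Windows CE operating system and an ARM processor as its software and hardware platform. In experiments on 1,250 groups of test data from 10 different speakers, NA-DTW raised the recognition rate from 96.6% to 99.76% and cut the computation time from 469 ms to 241 ms compared with the traditional DTW algorithm. The results show that NA-DTW can perform speaker-dependent, isolated-word speech recognition and meets the real-time detection requirement of an embedded system.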

Keywords: endoscope, dynamic time warping, reference template, speaker dependent, embedded system

Abstract: This paper presents a novel system for minimally invasive surgery. The system uses an Endoscopic Automatic Positioner (EAP), controlled by a speech recognition engine, to clamp and dynamically position the laparoscope. The EAP's motion instructions are derived from the voice commands of a specific doctor, which are recognized by a speaker-dependent speech recognition algorithm based on Dynamic Time Warping (DTW). By improving the reference template, the enhanced DTW algorithm recognizes the defined commands and rejects irrelevant utterances. An ARM-core embedded platform is designed to run the algorithm on the Windows CE operating system. On this platform, the algorithm is evaluated on 1,250 groups of test data from 10 individual speakers. Compared with the traditional algorithm, the enhanced algorithm improves the recognition rate by 3.16 percentage points and shortens the calculation time from 469 ms to 241 ms, a reduction of about 49%. The results demonstrate the feasibility of the enhanced algorithm and its ability to meet the real-time requirement of an embedded system.

Key words: endoscope, dynamic time warping, reference template, speaker dependent, embedded system
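
The abstract describes matching a spoken command against stored reference templates with DTW and rejecting irrelevant utterances. The following is a minimal sketch of that general idea using classic DTW only; it is not the paper's NA-DTW, whose template normalization and averaging steps are not specified in the abstract. The function names (dtw_distance, recognize), the reject_threshold parameter, the path-length normalization, and the toy one-dimensional features are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic DTW distance between two sequences of feature vectors.

    Hypothetical helper: each element of a and b is a list of floats of
    equal dimension. The result is divided by len(a) + len(b) so that
    scores for templates of different lengths are roughly comparable
    (an assumed normalization, not taken from the paper).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # stretch b's frame over a
                cost[i][j - 1],      # stretch a's frame over b
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m] / (n + m)


def recognize(utterance, templates, reject_threshold):
    """Return the best-matching command word, or None if rejected.

    templates maps a command word to one reference feature sequence.
    reject_threshold is an assumed tuning parameter: utterances whose
    best score exceeds it are treated as irrelevant.
    """
    best_word, best_score = None, math.inf
    for word, template in templates.items():
        score = dtw_distance(utterance, template)
        if score < best_score:
            best_word, best_score = word, score
    if best_score > reject_threshold:
        return None
    return best_word


# Toy one-dimensional "features"; a real system would use per-frame
# acoustic features, which the abstract does not specify.
templates = {
    "up": [[0.0], [1.0], [2.0]],
    "down": [[2.0], [1.0], [0.0]],
}
slow_up = [[0.0], [0.0], [1.0], [2.0], [2.0]]
unrelated = [[10.0], [10.0]]
print(recognize(slow_up, templates, reject_threshold=0.5))
print(recognize(unrelated, templates, reject_threshold=0.5))
```

In this sketch a slower rendition of a command still aligns with its template because DTW allows frames to be stretched in time, while an utterance far from every template is rejected by the threshold.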