Computer Engineering and Applications, 2013, Vol. 49, Issue (24): 205-209.


Method for multiple speech source localization based on sub-band steered response power

NI Zhilian, CAI Weiping, ZHANG Yidian   

  1. School of Electrical Engineering, Jiujiang Vocational and Technical College, Jiujiang, Jiangxi 332007, China
  • Online: 2013-12-15 Published: 2013-12-11

基于子带可控响应功率的多声源定位方法

倪志莲,蔡卫平,张怡典   

  1. 九江职业技术学院 电气工程学院,江西 九江 332007

Abstract: To improve the localization performance of a microphone array when multiple speakers are present, a method for multiple speech source localization based on sub-band steered response power is presented. The speech signal is divided into seven sub-bands in the frequency domain, and a steered response power function with phase transform weighting (SRP-PHAT) is computed in each sub-band. Initial estimates of the source location are obtained by searching the source space for the maximum of each function. Because speech signals are sparse in frequency, these initial estimates include the locations of multiple sources. The final source location estimates are derived from them by agglomerative clustering. Simulation and experimental results show that, with two speakers, a 10 dB signal-to-noise ratio and moderate reverberation, the proposed algorithm increases the correct localization rate by about 4% and reduces the extra localization rate by about 7% compared with the conventional algorithm.

Key words: microphone array, multiple speech source localization, sub-band steered response power, clustering
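
The final step described in the abstract, turning per-sub-band initial estimates into a set of source locations by agglomerative clustering, might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the linkage rule, the stopping criterion, or how spurious estimates are handled, so centroid linkage, the `max_dist` threshold, the `min_votes` filter, and the function names `agglomerative_cluster` and `locate_sources` are all hypothetical choices. The sub-band SRP-PHAT search itself is not shown; its output is represented by a plain list of (x, y) coordinates.

```python
import math


def agglomerative_cluster(points, max_dist):
    """Bottom-up clustering with centroid linkage (an assumed choice).

    Starts with one cluster per point and repeatedly merges the two
    clusters whose centroids are closest, stopping once the closest
    pair is farther apart than max_dist.
    """
    clusters = [[tuple(p)] for p in points]

    def centroid(cluster):
        return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))

    while len(clusters) > 1:
        best = None  # (distance, index_i, index_j) of the closest pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_dist:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


def locate_sources(subband_estimates, max_dist, min_votes=2):
    """Return one location per cluster supported by >= min_votes sub-bands.

    min_votes is a hypothetical filter for isolated estimates; the
    abstract does not say how such estimates are treated.
    """
    clusters = agglomerative_cluster(subband_estimates, max_dist)
    kept = [c for c in clusters if len(c) >= min_votes]
    kept.sort(key=len, reverse=True)  # best-supported source first
    return [tuple(sum(axis) / len(c) for axis in zip(*c)) for c in kept]


# Seven made-up sub-band estimates (metres), one per sub-band.
estimates = [
    (1.0, 2.0), (1.1, 2.1), (0.9, 1.9), (1.0, 2.1),  # near one speaker
    (3.0, 1.0), (3.1, 0.9),                          # near a second speaker
    (2.0, 3.5),                                      # isolated estimate
]
sources = locate_sources(estimates, max_dist=0.5, min_votes=2)
print(sources)  # two locations; the isolated estimate is dropped
```

With these made-up inputs, four sub-bands agree on one position and two on another, so two sources are reported. Lowering `min_votes` to 1 would also report the isolated estimate, which is the kind of spurious detection that an extra rate would count.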

摘要: 为了提高多个说话人情况下麦克风阵列的定位性能,提出基于子带可控响应功率的多声源定位算法。该算法将语音信号频域分为7个子带,在每个子带计算相位变换加权的可控响应功率函数,在声源空间搜索其最大值得到声源位置的初始估计。根据语音信号频率的稀疏性,这些初始估计包含多个声源的位置,运用会聚聚类算法得到最终的声源位置估计。仿真和实验表明,在有2个说话人,10 dB信噪比,较强混响的条件下,该算法比传统算法的定位正确率提高了约4%,额外率降低了约7%。

关键词: 麦克风阵列, 多声源定位, 子带可控响应功率, 聚类