Computer Engineering and Applications ›› 2013, Vol. 49 ›› Issue (1): 163-166.


Design of speaker location and tracking system based on DSP

CAO Jie1, HE Yixi2   

  1. School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China
  2. Department of Telecommunication, School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China
  • Online: 2013-01-01  Published: 2013-01-16

基于DSP的说话人定位跟踪系统的设计

曹  洁1,何裔玺2   

  1. 兰州理工大学 计算机与通信学院,兰州 730050
  2. 兰州理工大学 计算机与通信学院,通信工程系,兰州 730050

Abstract: To address inaccurate speaker localization and tracking in a meeting room, an audio-visual fusion method for localization and tracking, implemented on a digital signal processor (DSP), is proposed. For visual localization and tracking, a Kalman filter and the Mean-shift algorithm are used to search for the speaker's optimal position. In parallel, the target's position is estimated from audio using the time difference of arrival (TDOA). A Kalman information fusion center then fuses the audio and visual estimates to improve the stability of the audio-visual system. Experimental results show that the system processes 320×240-pixel images at an average of 20 frames/s and that, compared with a single-modality system, the proposed method improves localization and tracking accuracy by 17% and is more stable.

Key words: information fusion, audio location, target tracking, Kalman filter

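Abstract (translated from the Chinese): To address inaccurate real-time localization and tracking of an indoor speaker, an audio-visual fusion localization and tracking method based on the TMS320DM6437 hardware platform is proposed. The method uses a Kalman filter and the Mean-shift algorithm to search for the speaker's optimal position for video-based localization and tracking. At the same time, an audio method based on time difference of arrival is used to estimate the target position. A Kalman information integration center fuses the audio and video estimates to improve the stability of localization and tracking in the audio-visual system. Experimental results show that the method achieves an average tracking speed of 20 frames/s on 320×240 images and, compared with single-modality localization and tracking systems, improves target localization and tracking accuracy by 17%; the improvement is marked and stable.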

Key words (translated from the Chinese): information integration, sound source localization, target tracking, Kalman filter
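
The abstract describes a Kalman information fusion center that combines the audio (TDOA) and visual (Kalman plus Mean-shift) position estimates but does not give the filter design. The following is only a minimal sketch of how such a fusion step could look, assuming a one-dimensional position (for example, the horizontal image coordinate in pixels), a random-walk motion model, and sequential scalar measurement updates. The function names and the noise variances `q`, `r_video`, and `r_audio` are hypothetical illustration values, not taken from the paper.

```python
def kalman_predict(x, p, q):
    """Random-walk prediction: the estimate is kept, its variance grows by q."""
    return x, p + q


def kalman_update(x, p, z, r):
    """Scalar Kalman measurement update for measurement z with variance r."""
    k = p / (p + r)                 # Kalman gain
    return x + k * (z - x), (1.0 - k) * p


def fuse_step(x, p, z_video, z_audio, q=1.0, r_video=4.0, r_audio=25.0):
    """One fusion cycle: predict, then apply whichever measurements exist.

    z_video / z_audio may be None when a modality has no estimate
    (e.g. the speaker is silent or the visual tracker lost the target).
    All variances are assumed values chosen for illustration only.
    """
    x, p = kalman_predict(x, p, q)
    if z_video is not None:
        x, p = kalman_update(x, p, z_video, r_video)
    if z_audio is not None:
        x, p = kalman_update(x, p, z_audio, r_audio)
    return x, p


if __name__ == "__main__":
    x, p = 0.0, 1e6                 # vague initial estimate
    x, p = fuse_step(x, p, z_video=100.0, z_audio=110.0)
    print(x, p)
```

With these assumed variances the fused estimate falls between the two measurements and closer to the visual one, and its variance is smaller than that of either modality alone, which is the sense in which fusion can stabilize tracking.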