Computer Engineering and Applications, 2012, Vol. 48, Issue (13): 118-124.


Speaker tracking based on audio-video information fusion

CAO Jie, ZHENG Jingrun   

  1. College of Electrical and Information Engineering, Lanzhou University of Technology, Lanzhou 730050, China
  Online: 2012-05-01    Published: 2012-05-09

Abstract: To overcome the limitations of tracking with audio or video information alone, a novel speaker tracking algorithm based on audio-video information fusion with an importance particle filter is proposed. The algorithm runs as a closed-loop tracking system in which five modules cooperate: bottom-level tracking, a fusion center, importance particle filtering, tracking-result output, and result feedback. In the bottom-level tracking module, exploiting the complementarity between a speaker's speech and image, tracking information is acquired both by mean shift tracking based on facial skin color and by sound source localization based on the time delay of arrival at a microphone array. These cues are integrated in the fusion center to obtain an audio-video fused importance function and a fused likelihood model. The fused data are then processed by the importance particle filter to produce the tracking output, and the result is fed back dynamically to the skin color tracking and sound source localization modules. This closed-loop design keeps the algorithm running in real time. Experiments on the AMI Meeting Corpus show that the proposed approach is more robust and accurate than trackers using only audio or only video information, achieving an average tracking error rate of 9.32%.

Key words: object tracking, sound source localization, skin color tracking, mean shift, importance particle filter
