Audio visual emotion recognition based on modified asynchronous DBN models

Computer Engineering and Applications, 2014, Vol. 50, Issue (21): 162-165.

ZHANG Xiaojing1,2, JIANG Dongmei1,2, FAN Ping3, SAHLI Hichem3

1. School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China
2. Shaanxi Provincial Key Laboratory on Speech and Image Information Processing, Xi’an 710072, China
3. Department of Electronics and Informatics, Vrije Universiteit Brussel, Brussels 1050, Belgium

Online: 2014-11-01 Published: 2014-10-28

Abstract

This paper proposes a modified triple-stream asynchronous dynamic Bayesian network (DBN) model, VVA_AsyDBN, for audio-visual emotion recognition. The two visual input streams, facial geometric features (GF) and facial active appearance model (AAM) features, are synchronous at the state level, while their states may be asynchronous with those of the audio input stream (Mel-frequency cepstral coefficients, MFCC) within controllable constraints. Emotion recognition experiments are carried out on the eNTERFACE’05 audio-visual emotion database, and the results are compared with those of the traditional state-synchronous multi-stream hidden Markov model (MSHMM) and of an audio-visual asynchronous DBN model (T_AsyDBN) with two audio feature streams (MFCC and local prosodic features, LP) and one visual feature stream. The results show that VVA_AsyDBN achieves the highest recognition rate, 75.61%, which is 12.50% higher than the visual-only HMM, 2.32% higher than the MSHMM using the AAM, GF and MFCC features, and 1.65% higher than the best recognition rate of the T_AsyDBN model, which uses the MFCC, LP and AAM features.

Key words: audio visual fusion, Dynamic Bayesian Network (DBN), Active Appearance Model (AAM), asynchrony constraint
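
The abstract does not spell out how the asynchrony between the audio and visual streams is constrained. As a minimal sketch only, one common reading of bounded state asynchrony is that the audio and visual state indices may differ by at most a fixed number of states. The function names, the parameter `max_async`, and the absolute-difference form of the bound are all assumptions made for illustration, not the authors' formulation.

```python
from itertools import product


def allowed_joint_states(num_states: int, max_async: int) -> list:
    """Enumerate (audio_state, visual_state) index pairs that satisfy an
    ASSUMED asynchrony bound |audio_state - visual_state| <= max_async."""
    return [
        (a, v)
        for a, v in product(range(num_states), repeat=2)
        if abs(a - v) <= max_async
    ]


def satisfies_constraint(audio_path, visual_path, max_async: int) -> bool:
    """Check two frame-aligned state sequences against the same assumed bound."""
    return all(abs(a - v) <= max_async for a, v in zip(audio_path, visual_path))


# max_async=0 forces the streams to share a state index (the synchronous case);
# max_async=1 lets one stream lead or lag the other by one state.
sync_pairs = allowed_joint_states(num_states=3, max_async=0)
async_pairs = allowed_joint_states(num_states=3, max_async=1)
```

Under this reading, setting the bound to zero recovers a state-synchronous model, while a larger bound enlarges the joint state space that decoding has to search.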

ZHANG Xiaojing, JIANG Dongmei, FAN Ping, SAHLI Hichem. Audio visual emotion recognition based on modified asynchronous DBN models[J]. Computer Engineering and Applications, 2014, 50(21): 162-165.
