Computer Engineering and Applications ›› 2018, Vol. 54 ›› Issue (9): 1-4. DOI: 10.3778/j.issn.1002-8331.1711-0361


Analysis of importance of audio signals in multimedia emotion tagging

CHEN Mo, GUO Lei   

  1. School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
  • Online: 2018-05-01 Published: 2018-05-15


陈  墨,郭  雷   

  1. 西北工业大学 自动化学院,西安 710072

Abstract: Emotion tagging is an active area of affective computing, and a number of published works address emotion tagging for images, audio, and multimedia clips. This paper analyzes the importance of audio signals within a previously proposed EEG-based brain-encoding framework for emotion tagging. The open-access affective computing dataset DEAP is used as the benchmark. For the analysis, three kinds of visual features and one set of audio features are extracted from the video clips. The visual features alone are first used for emotion tagging under the framework; the combined audio and visual features are then run through the same procedure. The results indicate that combining audio and visual features improves emotion tagging accuracy over using visual features alone, and that the increase in feature dimensionality causes no loss in performance.

Key words: affective computing, emotion tagging, electroencephalogram (EEG), multimodal fusion
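
The comparison described in the abstract (visual features alone versus audio and visual features combined, run through the same procedure) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not say how the features are combined, so simple concatenation is used; a nearest-centroid classifier with leave-one-out evaluation stands in for the EEG-based brain-encoding framework, which is not reproduced; and the feature values, feature dimensions, and binary labels are synthetic placeholders, not DEAP data. The printed accuracies therefore say nothing about the paper's results.

```python
import random


def fuse(visual, audio):
    """Feature-level fusion (assumed): concatenate each clip's visual and audio vectors."""
    return [v + a for v, a in zip(visual, audio)]


def loo_accuracy(features, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier (a stand-in tagger)."""
    correct = 0
    for i in range(len(features)):
        # Build one centroid per class from every clip except the held-out one.
        centroids = {}
        for label in set(labels):
            rows = [f for j, (f, y) in enumerate(zip(features, labels))
                    if j != i and y == label]
            centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        # Tag the held-out clip with the class of the closest centroid.
        pred = min(
            centroids,
            key=lambda c: sum((x - m) ** 2
                              for x, m in zip(features[i], centroids[c])),
        )
        correct += pred == labels[i]
    return correct / len(features)


# Synthetic placeholder data: 40 clips, two hypothetical emotion classes.
rng = random.Random(0)
labels = [i % 2 for i in range(40)]
visual = [[rng.gauss(0.3 * y, 1.0) for _ in range(6)] for y in labels]  # 6 dims, hypothetical
audio = [[rng.gauss(0.8 * y, 1.0) for _ in range(4)] for y in labels]   # 4 dims, hypothetical

# Same procedure, two feature sets: visual only, then audio + visual.
fused = fuse(visual, audio)
acc_visual = loo_accuracy(visual, labels)
acc_fused = loo_accuracy(fused, labels)

print(f"visual only:    {acc_visual:.3f}")
print(f"audio + visual: {acc_fused:.3f}")
```

Keeping the evaluation routine identical for both feature sets mirrors the design stated in the abstract: any difference in accuracy can then be attributed to the added audio features rather than to a change in procedure.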
