Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (8): 165-172. DOI: 10.3778/j.issn.1002-8331.2302-0088

• Pattern Recognition and Artificial Intelligence •

双元双模态下二次门控融合的多模态情感分析

刘青文,买日旦·吾守尔,古兰拜尔·吐尔洪   

  1. 新疆大学 信息科学与工程学院,乌鲁木齐 830046
  • 出版日期:2024-04-15 发布日期:2024-04-15

Bi-Bi-Modality with Bi-Gated Fusion in Multimodal Sentiment Analysis

LIU Qingwen, Mairidan·Wushouer, Gulanbaier·Tuerhong   

  1. College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
  • Online:2024-04-15 Published:2024-04-15

摘要: 为了平衡情感信息在不同模态中分布的不均匀性,获得更深层次的多模态情感表征,提出了一种基于双元双模态二次门控融合的多模态情感分析方法。对文本、视觉模态,文本、语音模态分别融合,充分考虑文本模态在三个模态中的优势地位。同时为了获得更深层次的多模态交互信息,使用二次融合。在第一次融合中,使用融合门决定向主模态添加多少补充模态的知识,得到两个双模态混合知识矩阵。在第二次融合中,考虑到两个双模态混合知识矩阵中存在冗余、重复的信息,使用选择门从中选择有效、精简的情感信息作为双模态融合后的知识。在公开数据集CMU-MOSEI上,情感二分类的准确率和F1值分别达到了86.2%、86.1%,表现出良好的健壮性和先进性。

关键词: 多模态情感分析, 双元双模态, 二次融合, 门控注意力机制

Abstract: To balance the uneven distribution of emotional information across modalities and obtain a deeper multimodal emotional representation, this paper proposes bi-bi-modality with bi-gated fusion (BBBGF), a method for multimodal sentiment analysis. The text-vision and text-audio modality pairs are fused separately, which fully accounts for the dominant position of the text modality among the three modalities. A two-stage fusion is used to capture deeper multimodal emotional interaction information. In the first fusion, a fusion gate decides how much knowledge from the supplementary modality is added to the main modality, yielding two bi-modality hybrid knowledge matrices. In the second fusion, because the two bi-modality hybrid knowledge matrices contain redundant and repeated information, a selection gate selects effective, non-repeating emotional information from them as the final knowledge. On the public dataset CMU-MOSEI, the accuracy and F1 score for binary sentiment classification reach 86.2% and 86.1%, respectively, showing good robustness and advanced performance.

Key words: multimodal sentiment analysis, bi-bi-modality, bi-gated fusion, gated attention mechanism
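
The two-stage fusion described in the abstract can be illustrated with a minimal sketch. The abstract does not give the gate equations, so everything below is an assumption made for illustration: the sigmoid gates, the additive form of the first fusion, the convex mix in the second fusion, the random weights, and all function names (`fusion_gate`, `selection_gate`, `random_params`) are hypothetical and are not taken from the paper. Single feature vectors stand in for the knowledge matrices.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, vec):
    """Affine map; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def fusion_gate(main, supp, weights, bias):
    """First fusion (assumed form): add a gated share of the
    supplementary modality to the main modality."""
    gate = [sigmoid(z) for z in linear(weights, bias, main + supp)]
    return [m + g * s for m, g, s in zip(main, gate, supp)]


def selection_gate(h_a, h_b, weights, bias):
    """Second fusion (assumed form): per-dimension convex mix that
    selects between the two bimodal representations."""
    gate = [sigmoid(z) for z in linear(weights, bias, h_a + h_b)]
    return [g * a + (1.0 - g) * b for g, a, b in zip(gate, h_a, h_b)]


def random_params(dim, rng):
    """Hypothetical untrained parameters: a dim x 2*dim weight matrix and zero bias."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(2 * dim)]
               for _ in range(dim)]
    return weights, [0.0] * dim


rng = random.Random(0)
dim = 4

# Toy feature vectors standing in for encoded text, vision, and audio.
text = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
vision = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
audio = [rng.uniform(-1.0, 1.0) for _ in range(dim)]

w_tv, b_tv = random_params(dim, rng)
w_ta, b_ta = random_params(dim, rng)
w_sel, b_sel = random_params(dim, rng)

# Stage 1: text is the main modality in both bimodal pairs.
h_tv = fusion_gate(text, vision, w_tv, b_tv)
h_ta = fusion_gate(text, audio, w_ta, b_ta)

# Stage 2: the selection gate merges the two bimodal representations.
fused = selection_gate(h_tv, h_ta, w_sel, b_sel)
print(fused)
```

In this sketch text appears as the main modality in both pairs, which mirrors the dominant role the abstract assigns to it; in a trained model the gate parameters would be learned rather than sampled at random.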