Computer Engineering and Applications (计算机工程与应用) ›› 2024, Vol. 60 ›› Issue (2): 137-146. DOI: 10.3778/j.issn.1002-8331.2207-0456

• Pattern Recognition and Artificial Intelligence •

Multimodal Sentiment Analysis Based on Information Bottleneck (利用信息瓶颈的多模态情感分析)

CHENG Zichen (程子晨), LI Yan (李彦), GE Jiangwei (葛江炜), JIU Mengfei (纠梦菲), ZHANG Jingwei (张敬伟)

  1. College of Electronic and Communication Engineering, Tianjin Normal University, Tianjin 300387, China
  2. Tianjin Key Laboratory of Wireless Mobile Communications and Power Transmission, Tianjin 300387, China
  • Online: 2024-01-15  Published: 2024-01-15

Abstract: In multimodal sentiment analysis, previous research has focused mainly on how to fuse information from different modalities through cross-modal interaction. However, complex fusion strategies inevitably produce multimodal representation vectors that carry a large amount of noise irrelevant to the downstream task, which raises the risk of overfitting and degrades prediction quality. To address this problem, this paper follows information bottleneck theory and designs a mutual information estimation module containing two mutual information estimators. The module maximizes a lower bound on the mutual information between the multimodal representation vector and the true label while minimizing the mutual information between the multimodal representation vector and the input data, so as to obtain a concise multimodal representation with good predictive ability. Comparative experiments on the MOSI, MOSEI, and CH-SIMS datasets show that the proposed method is effective.

Key words: multimodal sentiment analysis, information bottleneck theory, mutual information estimation
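
For readers unfamiliar with the information bottleneck, the trade-off described in the abstract can be written in its generic textbook form. This is a sketch under stated assumptions, not the paper's exact formulation: the symbols $X$ (multimodal input), $Y$ (true label), $Z$ (multimodal representation), the trade-off weight $\beta$, the variational predictor $q_\phi$, and the reference distribution $r$ are all notation introduced here. The two bounds shown are standard variational bounds, and the abstract does not say which estimators the authors actually use.

```latex
% Generic information bottleneck objective (assumed notation):
% keep what Z says about the label Y, discard what it retains of the input X.
\max_{\theta}\; I(Z;Y) \;-\; \beta\, I(Z;X), \qquad \beta > 0

% A standard variational lower bound on the label term
% (q_\phi is an assumed auxiliary predictor):
I(Z;Y) \;=\; H(Y) - H(Y \mid Z)
       \;\ge\; H(Y) + \mathbb{E}_{p(z,y)}\!\left[\log q_\phi(y \mid z)\right]

% A standard variational upper bound on the input term
% (r is an assumed reference distribution over Z):
I(Z;X) \;\le\; \mathbb{E}_{p(x)}\!\left[ D_{\mathrm{KL}}\!\left(p_\theta(z \mid x)\,\|\,r(z)\right) \right]

% Since H(Y) does not depend on \theta, combining the two bounds
% gives a surrogate loss to minimize:
\mathcal{L}(\theta,\phi) \;=\; \mathbb{E}_{p(z,y)}\!\left[-\log q_\phi(y \mid z)\right]
  \;+\; \beta\, \mathbb{E}_{p(x)}\!\left[ D_{\mathrm{KL}}\!\left(p_\theta(z \mid x)\,\|\,r(z)\right) \right]
```

In this form, a larger $\beta$ compresses $Z$ more aggressively, which corresponds to the abstract's goal of removing task-irrelevant noise from the fused representation.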