计算机工程与应用 ›› 2025, Vol. 61 ›› Issue (15): 199-208. DOI: 10.3778/j.issn.1002-8331.2404-0424

• Pattern Recognition and Artificial Intelligence •

融合反事实推理的多模态情感分析算法研究

王淑娴,杨海,冯程,李雪   

  1. 山东交通学院 信息科学与电气工程学院,济南  250357
  • 出版日期:2025-08-01 发布日期:2025-07-31

Multimodal Sentiment Analysis Based on Counterfactual Reasoning

WANG Shuxian, YANG Hai, FENG Cheng, LI Xue   

  1. College of Information Science and Electrical Engineering, Shandong Jiaotong University, Jinan 250357, China
  • Online: 2025-08-01 Published: 2025-07-31

摘要: 在多模态情感分析中,现有的情感识别模型大多在测试集上准确率高,但在实际应用中,模型情感识别准确率比测试集中低,即模型的泛化能力不强。许多研究表明,这种现象归因于:多模态情感分析模型在训练阶段严重依赖于文本模态,进而学习到了文本模态和情感标签之间关于社会、种族、性别等刻板印象,也称为文本偏见。提出一种基于反事实推理的多模态情感分析模型(counterfactual reasoning for multimodal sentiment analysis,CRFM),构建因果图并分析情感识别结果的因果效应,利用反事实推理从情感识别结果总效应中去除文本偏见直接效应,从而去除文本偏见不良影响,提高模型情感识别准确率,进而增强模型泛化能力。利用MOSI和MOSEI两种数据集,分别与六种基线模型进行对比实验,CRFM准确率达87.05%,优于其他基线模型。此外,CRFM在两种数据集中准确率更稳定,模型泛化能力更强。

关键词: 多模态情感分析, 反事实推理, 因果效应, 因果图

Abstract: In multimodal sentiment analysis, most existing sentiment recognition models achieve high accuracy on the test set but lower accuracy in real-world applications, which indicates weak generalization ability. Many studies attribute this to the heavy reliance of these models on the text modality during training, which leads them to learn stereotyped associations between the text modality and sentiment labels concerning society, race, gender, and similar factors, known as textual bias. A counterfactual reasoning for multimodal sentiment analysis (CRFM) model is proposed, which constructs a causal diagram and analyzes the causal effects behind the sentiment recognition results. Counterfactual reasoning is used to remove the direct effect of textual bias from the total effect of the sentiment recognition results, thereby eliminating the adverse influence of textual bias, improving sentiment recognition accuracy, and enhancing the model’s generalization ability. Comparative experiments against six baseline models are conducted on the MOSI and MOSEI datasets. CRFM reaches an accuracy of 87.05%, outperforming the baseline models. In addition, the accuracy of CRFM is more stable across the two datasets, indicating stronger generalization ability.

Key words: multimodal sentiment analysis, counterfactual reasoning, causal effects, causal diagram
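
The debiasing step described in the abstract (subtracting the direct effect of textual bias from the total effect) can be illustrated with a toy calculation. This is only a minimal sketch under stated assumptions, not the CRFM implementation: the two-branch layout (a text-only branch plus a fused multimodal branch), the additive fusion in `fuse`, the helper `debiased_scores`, the constant `reference` value standing in for a blocked input, and all numbers are hypothetical and are not taken from the paper.

```python
import math


def fuse(text_logits, multimodal_logits):
    """Hypothetical fusion: sum the branch logits, then apply log-softmax."""
    summed = [t + m for t, m in zip(text_logits, multimodal_logits)]
    log_norm = math.log(sum(math.exp(s) for s in summed))
    return [s - log_norm for s in summed]


def debiased_scores(text_logits, multimodal_logits, reference=0.0):
    """Total effect minus the direct effect of the text branch."""
    # A constant vector stands in for a "blocked" branch (assumption).
    blank = [reference] * len(text_logits)
    y_factual = fuse(text_logits, multimodal_logits)  # Y(t, m)
    y_text_only = fuse(text_logits, blank)            # Y(t, m*)
    y_reference = fuse(blank, blank)                  # Y(t*, m*)
    total_effect = [a - b for a, b in zip(y_factual, y_reference)]
    direct_effect = [a - b for a, b in zip(y_text_only, y_reference)]
    return [te - de for te, de in zip(total_effect, direct_effect)]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


# Toy example with classes [negative, positive].
text_logits = [2.0, 0.0]        # text branch leans negative (the bias)
multimodal_logits = [0.0, 1.5]  # fused evidence leans positive

biased_prediction = argmax(fuse(text_logits, multimodal_logits))
scores = debiased_scores(text_logits, multimodal_logits)
debiased_prediction = argmax(scores)
```

In this toy setting the biased prediction follows the text branch, while the debiased prediction follows the multimodal evidence. With the additive fusion assumed here, the subtraction reduces to the multimodal logits shifted by a constant; a different fusion function would give a less trivial result.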