Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (10): 155-165. DOI: 10.3778/j.issn.1002-8331.2401-0274

• Pattern Recognition and Artificial Intelligence •

Aspect-Level Multimodal Sentiment Analysis Based on Experts Routing

ZHAO Jingsheng, WANG Yongzheng, YANG Xinyi, QU Weilong, ZHU Qiaoming

  1. School of Information and Control Engineering, Qingdao University of Technology, Qingdao, Shandong 266520, China
  2. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
  • Online: 2025-05-15 Published: 2025-05-15

Abstract: Aspect-level multimodal sentiment analysis obtains the aspect-sentiment pairs in a sentence through two subtasks: aspect-term extraction, which extracts the aspect terms of entities such as people and products mentioned in the sentence, and aspect-level sentiment classification, which predicts the user's sentiment polarity toward a given aspect term. Two mainstream approaches currently handle these subtasks, and each has its own problem: (1) Two independent models process the two subtasks separately; the semantic connection between the models is weak, and the low-level features of one task do not carry over to the other. (2) One model processes both subtasks simultaneously with a single shared set of parameters, which makes it difficult to improve each task according to the distinct characteristics of aspect-term extraction and aspect-level sentiment classification, and results in inefficient extraction of aspect-sentiment pairs. To address these problems, an aspect-level multimodal sentiment analysis method based on expert routing is proposed. It handles the two subtasks in a targeted manner within a single model by introducing the idea of expert routing with a sparse activation strategy: not all parameters are activated for every input; instead, only a subset of the parameters is invoked for each task, according to the task-specific requirements of the input. In addition, the model uses key information from the image (text) to attend to the related parts of the text (image), forming preliminary local semantic correspondences between visual regions and aspect terms that carry sentiment information, and then obtains a deep mixed semantic matrix of shared and complementary information across modalities through gating units. Finally, sentiment is predicted by the aspect-level sentiment classification module. Experimental results on two public datasets, Twitter2015 and Twitter2017, show that the model outperforms a series of baseline models.

Key words: aspect-level multimodal sentiment analysis, experts routing, gating units, attention mechanism
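
The sparse activation strategy described in the abstract can be illustrated with a small sketch. The abstract does not specify the router or the experts, so everything below is an assumption made for illustration: the class name `SparseExpertLayer`, the linear experts, the additive task embedding, and the `top_k` selection rule are hypothetical and are not taken from the paper. The sketch only shows the general idea that a routing decision conditioned on the subtask selects a few experts, and that the parameters of the remaining experts are never used for that input.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


class SparseExpertLayer:
    """Toy task-conditioned expert routing (hypothetical, not the paper's model)."""

    def __init__(self, dim, num_experts, num_tasks, top_k, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.top_k = top_k
        # Assumption: each expert is a single dim x dim linear map.
        self.experts = [rand_matrix(dim, dim) for _ in range(num_experts)]
        # Assumption: the router holds one scoring row per expert.
        self.router = rand_matrix(num_experts, dim)
        # Assumption: one learned embedding per subtask
        # (e.g. 0 = aspect-term extraction, 1 = sentiment classification).
        self.task_emb = rand_matrix(num_tasks, dim)

    def forward(self, x, task_id):
        # Condition the routing decision on the subtask by adding its embedding.
        routed = [a + b for a, b in zip(x, self.task_emb[task_id])]
        logits = matvec(self.router, routed)
        # Keep only the top-k experts; the others are never evaluated.
        ranked = sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)
        active = ranked[: self.top_k]
        # Renormalise the routing weights over the active experts only.
        weights = softmax([logits[e] for e in active])
        out = [0.0] * len(x)
        for w, e in zip(weights, active):
            y = matvec(self.experts[e], x)
            out = [o + w * yi for o, yi in zip(out, y)]
        return out, active


layer = SparseExpertLayer(dim=8, num_experts=4, num_tasks=2, top_k=2)
features = [1.0] * 8
extraction_out, extraction_experts = layer.forward(features, task_id=0)
classification_out, classification_experts = layer.forward(features, task_id=1)
```

In this sketch the same input features can be sent to different experts depending on `task_id`, which is one simple way to let two subtasks share a model without sharing every parameter. A real system would learn the router and experts jointly and would typically add a load-balancing term, neither of which is shown here.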