Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (23): 136-145.DOI: 10.3778/j.issn.1002-8331.2308-0083

• Pattern Recognition and Artificial Intelligence •

Target-Oriented Interaction Graph Neural Networks for Multimodal Aspect-Level Sentiment Analysis

ZHANG Lixia, WANG Kaixuan, PANG Zichao, LIANG Yun   

  1. College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China
  • Online:2024-12-01 Published:2024-11-29

Abstract: The multimodal aspect-level sentiment analysis task requires not only extracting text and image representations but also combining them with aspect semantic information. However, previous methods do not sufficiently model the interaction among aspects, text, and images: even when an attention mechanism establishes global correlations between modalities, it struggles to express their interaction at a fine granularity. To enable fine-grained information interaction across modalities, a target-oriented interaction graph neural network is proposed, which models the relationships among text, image, and aspect. Firstly, cross-attention is used to obtain aspect-oriented global representations of the text and image. Then, a multimodal interaction graph is built to connect the local and global representation nodes of the different modalities. Finally, a graph attention network fully fuses the features at both coarse and fine granularities. Experiments on two benchmark datasets show that the proposed model achieves better sentiment classification performance than models using only attention mechanisms.
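The pipeline described in the abstract, cross-attention to obtain aspect-oriented global representations, an interaction graph connecting local and global nodes, and graph-attention fusion, can be sketched as follows. This is a minimal illustrative sketch in PyTorch: all module names, dimensions, and the fully connected graph wiring are assumptions for exposition, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossAttention(nn.Module):
    """An aspect query attends over a modality's token sequence,
    yielding an aspect-oriented global representation."""
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, aspect, tokens):
        # aspect: (B, dim); tokens: (B, N, dim)
        q = self.q(aspect).unsqueeze(1)                       # (B, 1, dim)
        scores = q @ self.k(tokens).transpose(1, 2)           # (B, 1, N)
        attn = torch.softmax(scores / tokens.size(-1) ** 0.5, dim=-1)
        return (attn @ self.v(tokens)).squeeze(1)             # (B, dim)

class GATLayer(nn.Module):
    """Single-head graph attention over the interaction graph
    (assumed fully connected here for simplicity)."""
    def __init__(self, dim):
        super().__init__()
        self.w = nn.Linear(dim, dim, bias=False)
        self.a = nn.Linear(2 * dim, 1, bias=False)

    def forward(self, nodes):
        # nodes: (B, M, dim) -- local plus global representation nodes
        h = self.w(nodes)
        B, M, D = h.shape
        hi = h.unsqueeze(2).expand(B, M, M, D)
        hj = h.unsqueeze(1).expand(B, M, M, D)
        e = F.leaky_relu(self.a(torch.cat([hi, hj], dim=-1))).squeeze(-1)
        alpha = torch.softmax(e, dim=-1)                      # (B, M, M)
        return F.elu(alpha @ h)                               # (B, M, dim)

class TargetOrientedInteractionNet(nn.Module):
    """Hypothetical end-to-end wiring: cross-attention globals,
    interaction graph, GAT fusion, sentiment classifier."""
    def __init__(self, dim=64, num_classes=3):
        super().__init__()
        self.txt_att = CrossAttention(dim)
        self.img_att = CrossAttention(dim)
        self.gat = GATLayer(dim)
        self.cls = nn.Linear(dim, num_classes)

    def forward(self, text_tokens, image_regions, aspect):
        g_txt = self.txt_att(aspect, text_tokens)     # aspect-oriented text global
        g_img = self.img_att(aspect, image_regions)   # aspect-oriented image global
        # Interaction graph nodes: local tokens/regions plus the two globals.
        nodes = torch.cat([text_tokens, image_regions,
                           g_txt.unsqueeze(1), g_img.unsqueeze(1)], dim=1)
        fused = self.gat(nodes)                       # fuse at both granularities
        return self.cls(fused.mean(dim=1))            # sentiment logits
```

For example, with a batch of 2 samples, 10 text tokens, and 49 image regions of dimension 64, `TargetOrientedInteractionNet()(torch.randn(2, 10, 64), torch.randn(2, 49, 64), torch.randn(2, 64))` produces a `(2, 3)` tensor of sentiment logits.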

Key words: multimodal aspect-level sentiment analysis, attention mechanism, cross-modal attention, target-oriented interaction, graph attention networks
