Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (10): 193-199. DOI: 10.3778/j.issn.1002-8331.2108-0056

• Pattern Recognition and Artificial Intelligence •

ABAFN: Aspect-Based Sentiment Analysis Model for Multimodal

LIU Lulu, YANG Yan, WANG Jie   

  1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China
  • Online: 2022-05-15 Published: 2022-05-15




Abstract: As the Internet continues to develop, the volume of user reviews of e-commerce products keeps growing. Studying the sentiment orientation of these reviews is of great significance for guiding product updates and iterations. Previous work on aspect-based sentiment analysis has usually involved only the text modality. However, user-generated review data generally include not only plain text but also a large amount of combined image-and-text content. For such multimodal data comprising text and images, this paper proposes a novel aspect-based multimodal sentiment analysis model, ABAFN (aspect-based attention and fusion network). First, the model combines the pre-trained language model BERT with a bi-directional long short-term memory network to obtain contextual representations of the text and the aspect. In parallel, a pre-trained ResNet extracts image features to generate a visual representation. Then, an attention mechanism weights the textual and visual representations with respect to the aspect. Finally, the weighted representations of the two modalities are concatenated and fused to perform sentiment label classification. Experiments on the Multi-ZOL dataset show that ABAFN outperforms the results reported in the currently known literature.

Key words: aspect-based multimodal sentiment analysis, BERT, bi-directional long short-term memory network, ResNet, attention mechanism
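
The aspect-guided weighting and fusion steps described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. It assumes the BERT/BiLSTM and ResNet features are already computed and replaces them with random toy vectors. It also assumes dot-product scoring for the attention weights, which the abstract does not specify. The names `attend` and `classify`, the dimension `DIM`, and the label count `NUM_LABELS` are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def attend(aspect, features):
    """Pool feature vectors, weighting each by its relevance to the aspect.

    Relevance is an ASSUMED dot-product score; the abstract only says that
    attention is computed "based on aspect".
    """
    weights = softmax([dot(aspect, f) for f in features])
    dim = len(features[0])
    pooled = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return pooled, weights


def classify(fused, weight_rows, bias):
    """Linear layer followed by softmax over sentiment labels."""
    logits = [dot(row, fused) + b for row, b in zip(weight_rows, bias)]
    return softmax(logits)


rng = random.Random(0)
DIM, NUM_LABELS = 4, 3  # hypothetical sizes, chosen only for illustration


def rand_vec(n):
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


# Toy stand-ins for encoder outputs (assumptions, not real model features).
aspect = rand_vec(DIM)                          # aspect representation
text_feats = [rand_vec(DIM) for _ in range(5)]  # one vector per text token
image_feats = [rand_vec(DIM) for _ in range(3)] # one vector per image region

text_vec, text_w = attend(aspect, text_feats)
image_vec, image_w = attend(aspect, image_feats)

# Fusion by concatenation of the two aspect-weighted representations.
fused = text_vec + image_vec

W = [rand_vec(2 * DIM) for _ in range(NUM_LABELS)]  # untrained toy weights
b = [0.0] * NUM_LABELS
probs = classify(fused, W, b)
print(probs)
```

Because the classifier weights are random and untrained, the printed probabilities carry no meaning; the sketch only shows how one aspect vector drives the attention over both modalities before the concatenated vector reaches the classifier.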
