Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (7): 92-100. DOI: 10.3778/j.issn.1002-8331.2210-0288

• Pattern Recognition and Artificial Intelligence •

Multiview Interaction Learning Network for Multimodal Aspect-Level Sentiment Analysis

WANG Xuyang, PANG Wenqian, ZHAO Lijie   

  1. School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China
  • Online: 2024-04-01  Published: 2024-04-01


Abstract: Previous multimodal aspect-level sentiment analysis methods use only the generic text and image representations produced by pre-trained models. Such representations are insensitive to the correlation between aspects and opinion words, and the contribution of image information to each word representation cannot be obtained dynamically, so these methods cannot fully capture the correlation between the modalities and the aspects. To address these problems, a multiview interaction learning network is proposed. Sentence features are extracted separately from a context view and a syntax view, so that the global features of the text are fully exploited during multimodal interaction. The relationships among the text, the image, and the aspect are modeled to realize multimodal interaction. At the same time, the interactive representations of the different modalities are fused to dynamically obtain the contribution of visual information to each word in the text, fully extracting the correlation between modalities and aspects. Finally, the sentiment classification result is obtained through a fully connected layer and a Softmax layer. Experiments on two datasets show that the proposed model effectively improves multimodal aspect-level sentiment classification.
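The following is a minimal PyTorch sketch of the dynamic gated fusion idea the abstract describes: each word attends to image regions under aspect guidance, a per-word gate weighs how much visual information enters that word's representation, and a fully connected layer with Softmax produces the sentiment classes. All module names, dimensions, and design details here (GatedMultimodalFusion, d_model, the additive aspect bias) are illustrative assumptions, not the paper's actual implementation.

# Minimal sketch of aspect-guided cross-modal attention with a per-word
# dynamic gate, as described in the abstract. All names, dimensions, and
# design choices are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedMultimodalFusion(nn.Module):
    """Fuse per-word text features with attended visual features via a
    dynamic gate, then classify sentiment with FC + Softmax."""
    def __init__(self, d_model=256, num_classes=3):
        super().__init__()
        # Cross-modal attention: words (queries) attend to image regions.
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        # Gate deciding, per word, how much visual information to admit.
        self.gate = nn.Linear(2 * d_model, d_model)
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, text_feats, image_feats, aspect_feats):
        # text_feats:   (B, T, d) word representations (e.g. from the
        #               context and syntax views, already combined)
        # image_feats:  (B, R, d) image-region representations
        # aspect_feats: (B, A, d) aspect-term representations
        # Condition each word on the aspect via a simple additive bias
        # (an assumed, simplified form of aspect guidance).
        aspect_ctx = aspect_feats.mean(dim=1, keepdim=True)          # (B, 1, d)
        queries = text_feats + aspect_ctx                            # (B, T, d)
        # Each word attends to the image regions.
        vis_per_word, _ = self.attn(queries, image_feats, image_feats)
        # Dynamic gate: contribution of visual information to each word.
        g = torch.sigmoid(self.gate(torch.cat([text_feats, vis_per_word], dim=-1)))
        fused = g * vis_per_word + (1 - g) * text_feats              # (B, T, d)
        # Pool over words, then FC + Softmax over sentiment classes.
        logits = self.classifier(fused.mean(dim=1))                  # (B, C)
        return F.softmax(logits, dim=-1)

# Toy usage with random features in place of real pre-trained encodings.
model = GatedMultimodalFusion()
probs = model(torch.randn(2, 20, 256), torch.randn(2, 49, 256), torch.randn(2, 3, 256))
print(probs.shape)  # torch.Size([2, 3])

The sigmoid gate is what makes the visual contribution dynamic: it is recomputed per word from the concatenated text and attended-image vectors, so opinion words near the aspect can admit more visual evidence than function words.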

Key words: multimodal aspect-level sentiment analysis, pre-trained model, multiview learning, multimodal interaction, dynamic fusion
