Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (2): 1-18. DOI: 10.3778/j.issn.1002-8331.2305-0439

• Research Hotspots and Reviews •

Survey of Sentiment Analysis Algorithms Based on Multimodal Fusion

GUO Xu, Mairidan Wushouer, Gulanbaier Tuerhong   

  1. School of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
  • Online: 2024-01-15 Published: 2024-01-15

Abstract: Sentiment analysis is an emerging technology that aims to identify people’s attitudes toward entities. It can be applied in many domains and scenarios, such as product review analysis, public opinion analysis, mental health analysis, and risk assessment. Traditional sentiment analysis models focus on text, yet some forms of expression, such as sarcasm and hyperbole, are difficult to detect from text alone. As technology advances, people now express their opinions and feelings through multiple channels such as audio, images, and video, so sentiment analysis is shifting toward multimodality, which brings new opportunities to the field. Multimodal data contain rich visual and auditory information in addition to text, and fusing these modalities allows the implied sentiment polarity (positive, neutral, or negative) to be inferred more accurately. The main challenge of multimodal sentiment analysis is integrating sentiment information across modalities. This paper therefore focuses on the frameworks and characteristics of different fusion methods, describes the fusion algorithms that have been popular in recent years, and discusses multimodal sentiment analysis in small-sample scenarios. It also covers the current state of development, common datasets, feature extraction algorithms, application areas, and open challenges. This review is intended to help researchers understand the current state of research in multimodal sentiment analysis and to inspire the development of more effective models.

Key words: multimodal, sentiment analysis, modal fusion
