Computer Engineering and Applications, 2021, Vol. 57, Issue (11): 119-127. DOI: 10.3778/j.issn.1002-8331.2012-0293


Bidirectional Attention Mechanism Based Multimodal Sentiment Classification Method

HUANG Hongzhan, MENG Zuqiang   

  1. College of Computer and Electronics Information, Guangxi University, Nanning 530004, China
  • Online: 2021-06-01 Published: 2021-05-31



The growth of social networks provides large amounts of multimodal data for sentiment analysis. Sentiment classification based on multimodal content can exploit the correlated information between modalities, avoiding the incomplete grasp of overall sentiment that a single modality gives. Simple fusion methods cannot fully exploit the complementary characteristics of multiple modalities, so a Multimodal Bidirectional Attention Hybrid (MBAH) model is proposed. Starting from image and text features extracted by deep models, a bidirectional attention mechanism introduces information from the other modality into each modality: attention is computed between the low-level features of one modality and the semantic features of the other to learn the association information between the modalities. The high-level features of the two modalities are then joined into a cross-modal shared representation, which is fed into a multilayer perceptron to obtain the classification result. In addition, the MBAH model is combined with unimodal image and text self-attention models through late fusion, searching for the optimal decision weights to form the final decision. Experimental results show that the MBAH model outperforms other methods on sentiment classification.

Key words: sentiment classification, multimodal data, bidirectional attention mechanism, late fusion
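
The two mechanisms named in the abstract, bidirectional attention and late fusion with searched decision weights, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the scaled dot-product scoring, the requirement that both modalities share one feature dimension, the grid search over weights, and every name below (`attend`, `shared_representation`, `search_fusion_weights`) are assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(low_level, query):
    """Pool one modality's low-level vectors, guided by the other modality.

    low_level: list of d-dimensional vectors (e.g. image regions or tokens).
    query:     d-dimensional semantic vector of the OTHER modality.
    Scaled dot-product scoring is an assumption, not taken from the paper.
    """
    d = len(query)
    scores = [sum(f * q for f, q in zip(vec, query)) / math.sqrt(d)
              for vec in low_level]
    weights = softmax(scores)
    pooled = [sum(w * vec[i] for w, vec in zip(weights, low_level))
              for i in range(d)]
    return pooled, weights


def shared_representation(img_regions, txt_tokens, img_semantic, txt_semantic):
    """Attention in both directions, then join the two attended vectors."""
    img_attended, _ = attend(img_regions, txt_semantic)  # text guides image
    txt_attended, _ = attend(txt_tokens, img_semantic)   # image guides text
    return img_attended + txt_attended  # list concatenation, length 2d


def search_fusion_weights(prob_sets, labels, step=0.1):
    """Grid-search convex weights for late fusion of three models.

    prob_sets: three lists (hypothetically MBAH, image-only, text-only) of
               per-sample class-probability lists.
    Returns (best_accuracy, best_weights) on the given labelled samples.
    """
    n = round(1 / step)
    best_acc, best_w = -1.0, None
    for i in range(n + 1):
        for j in range(n + 1 - i):
            w = (i * step, j * step, (n - i - j) * step)
            correct = 0
            for k, y in enumerate(labels):
                n_classes = len(prob_sets[0][k])
                fused = [sum(w[m] * prob_sets[m][k][c] for m in range(3))
                         for c in range(n_classes)]
                if fused.index(max(fused)) == y:
                    correct += 1
            acc = correct / len(labels)
            if acc > best_acc:
                best_acc, best_w = acc, w
    return best_acc, best_w
```

In this sketch the joined vector would be passed to a multilayer perceptron, and the fusion weights would be searched on held-out validation predictions; both steps are left out to keep the example short.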

