Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (19): 205-213. DOI: 10.3778/j.issn.1002-8331.2104-0308

• Pattern Recognition and Artificial Intelligence •

Multi-feature Mixed Model Text Sentiment Analysis Method

LI Wenliang, YANG Qiuxiang, QIN Quan

  1. College of Software, North University of China, Taiyuan, Shanxi 030051, China
  • Publication date: 2021-10-01 Release date: 2021-09-29

Abstract:

In recent years, deep learning has been widely applied to text sentiment analysis. The Text Convolutional Neural Network (TextCNN) is the most representative model, but its semantic feature extraction has three problems: semantic features along the word embedding dimension are lost, max pooling alone extracts too few features, and the long-term dependencies of the text are lost. To address these problems, this paper proposes a text sentiment analysis method based on a multi-feature mixed model (BiLSTM-MFCNN). The method first uses a Bidirectional Long Short-Term Memory (BiLSTM) network to learn the long-term dependencies of the text. It then improves the convolutional layer and pooling layer of TextCNN to form a Multiple Features Convolution Neural Network (MFCNN). The convolutional layer applies five different convolution algorithms, which extract semantic features from the sentence dimension, the entire word embedding dimension, the single word embedding dimension, the adjacent word vector dimension, and the single word vector dimension. The pooling layer combines the max pooling and average pooling algorithms to obtain the sentiment features of the text. Comparative experiments on the Chinese NLPCC Emotion Classification Challenge and COAE2014 datasets and on an English Twitter dataset show that the mixed model achieves better results on text sentiment analysis tasks.
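
The pooling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `conv_over_words` and `max_and_average_pool`, the toy sentence matrix, the kernel values, and the ReLU activation are all assumptions made for illustration. The sketch shows only one TextCNN-style convolution (a kernel spanning adjacent words and the full embedding width), not the five convolution types of MFCNN, followed by concatenated max and average pooling.

```python
from typing import List

Matrix = List[List[float]]  # one row per word, one column per embedding dimension


def conv_over_words(sentence: Matrix, kernel: Matrix) -> List[float]:
    """Slide a kernel over adjacent words, covering the full embedding width.

    Valid convolution with stride 1 and a ReLU activation (an assumption here);
    returns one activation per window position.
    """
    window = len(kernel)
    if len(sentence) < window:
        raise ValueError("sentence is shorter than the kernel window")
    features = []
    for start in range(len(sentence) - window + 1):
        total = 0.0
        for i in range(window):
            for x, w in zip(sentence[start + i], kernel[i]):
                total += x * w
        features.append(max(0.0, total))  # ReLU
    return features


def max_and_average_pool(features: List[float]) -> List[float]:
    """Return [max, mean] so both the strongest and the overall response are kept."""
    return [max(features), sum(features) / len(features)]


# Toy input: 4 words with embedding size 2 (values invented for illustration).
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kernel = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical kernel spanning 2 adjacent words

feature_map = conv_over_words(sentence, kernel)
pooled = max_and_average_pool(feature_map)
```

Max pooling alone would keep only the single strongest activation in `feature_map`; appending the mean keeps a summary of the weaker responses as well, which is the kind of information the abstract says max pooling by itself fails to extract.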

Key words: sentiment analysis, hybrid model, Bidirectional Long Short-Term Memory (BiLSTM) network, Multiple Features Convolution Neural Network (MFCNN)