Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (14): 126-133. DOI: 10.3778/j.issn.1002-8331.2011-0037

Text Classification Method Based on LSTM-Attention and CNN Hybrid Model

TENG Jinbao, KONG Weiwei, TIAN Qiaoxin, WANG Zhaoqian   

  1. Xi’an University of Posts and Telecommunications, Xi’an 710121, China
  2. Shaanxi Provincial Key Laboratory of Network Data Analysis and Intelligent Processing, Xi’an 710121, China
  • Online:2021-07-15 Published:2021-07-14

基于LSTM-Attention与CNN混合模型的文本分类方法

滕金保,孔韦韦,田乔鑫,王照乾   

  1. 西安邮电大学,西安 710121
  2. 陕西省网络数据分析与智能处理重点实验室,西安 710121

Abstract:

Traditional Long Short-Term Memory (LSTM) networks and Convolutional Neural Networks (CNN) cannot reflect the importance of each word in a text when extracting features. To address this problem, a text classification method based on a hybrid model of LSTM-Attention and CNN is proposed. First, a CNN extracts local information from the text and integrates it into the semantics of the whole text. Second, an LSTM extracts the contextual features of the text, and an attention mechanism added after the LSTM computes attention scores for the LSTM output. Finally, the output of LSTM-Attention is fused with the output of the CNN, so that text features are extracted effectively while attention is focused on important words. Experimental results on three public data sets show that the proposed model outperforms LSTM, CNN, and their improved variants, and effectively improves text classification performance.

Key words: text classification, Long Short-Term Memory (LSTM), attention mechanism, Convolutional Neural Network (CNN), feature fusion
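
The two steps the abstract describes after the encoders, attention scoring over the LSTM outputs and fusion with the CNN output, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not specify the scoring function or the fusion operator, so the sketch assumes dot-product scoring against a hypothetical learned `query` vector and fusion by concatenation. The names `attention_pool` and `fuse` and the toy numbers are illustrative only.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each per-token hidden state by its attention score.

    hidden_states: list of T vectors, standing in for LSTM outputs.
    query: hypothetical learned vector (an assumption, not from the paper).
    Returns the weighted-sum context vector and the attention weights.
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return context, weights


def fuse(lstm_attention_vec, cnn_vec):
    """Fuse the two branches by concatenation (assumed fusion operator)."""
    return list(lstm_attention_vec) + list(cnn_vec)


# Toy inputs: three tokens with 2-dimensional hidden states, and a
# 3-dimensional vector standing in for the pooled CNN features.
hidden_states = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
query = [1.0, 1.0]
cnn_features = [0.5, 0.3, 0.9]

context, weights = attention_pool(hidden_states, query)
fused = fuse(context, cnn_features)
```

In this toy case the second token has the largest score, so it receives the largest attention weight and dominates the context vector. The fused vector would then feed a classifier layer, which is omitted here.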

摘要:

针对传统长短时记忆网络(Long Short-Term Memory,LSTM)和卷积神经网络(Convolution Neural Network,CNN)在提取特征时无法体现每个词语在文本中重要程度的问题,提出一种基于LSTM-Attention与CNN混合模型的文本分类方法。使用CNN提取文本局部信息,进而整合出全文语义;用LSTM提取文本上下文特征,在LSTM之后加入注意力机制(Attention)提取输出信息的注意力分值;将LSTM-Attention的输出与CNN的输出进行融合,实现了有效提取文本特征的基础上将注意力集中在重要的词语上。在三个公开数据集上的实验结果表明,提出的模型相较于LSTM、CNN及其改进模型效果更好,可以有效提高文本分类的效果。

关键词: 文本分类, 长短时记忆网络(LSTM), 注意力机制, 卷积神经网络(CNN), 特征融合