Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (23): 154-162. DOI: 10.3778/j.issn.1002-8331.2104-0212

• Pattern Recognition and Artificial Intelligence •

Multi-channel Attention Mechanism Text Classification Model Based on CNN and LSTM

TENG Jinbao, KONG Weiwei, TIAN Qiaoxin, WANG Zhaoqian, LI Long   

  1. Xi’an University of Posts and Telecommunications, Xi’an 710121, China
  2. Guilin University of Electronic Technology, Guilin, Guangxi 541004, China
  3. Shaanxi Provincial Key Laboratory of Network Data Analysis and Intelligent Processing, Xi’an 710121, China
  4. Guangxi Key Laboratory of Trusted Software, Guilin, Guangxi 541004, China
  • Online: 2021-12-01  Published: 2021-12-02

Abstract:

Aiming at the problem that the traditional Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) network cannot reflect the importance of each word in the text when extracting features, this paper proposes a multi-channel attention mechanism text classification model based on CNN and LSTM. Firstly, CNN and LSTM are used to extract the local information and the context features of the text; secondly, a multi-channel attention mechanism is used to compute attention scores over the outputs of the CNN and the LSTM; finally, the outputs of the multi-channel attention mechanism are fused, so that text features are extracted effectively while attention is focused on the important words. Experimental results on three public datasets show that the proposed model outperforms CNN, LSTM and their improved variants, and can effectively improve text classification performance.
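The architecture described in the abstract can be illustrated with a minimal PyTorch sketch. All layer sizes, the single-kernel CNN channel, the per-channel attention scorers, and fusion by concatenation are illustrative assumptions for this sketch, not the authors' published configuration:

```python
# Minimal sketch of a CNN + LSTM multi-channel attention classifier (PyTorch).
# Hyperparameters and the concatenation-based fusion are assumptions for
# illustration; they are not taken from the paper.
import torch
import torch.nn as nn


class MultiChannelAttentionClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, num_classes=2,
                 num_filters=100, kernel_size=3, hidden_dim=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # CNN channel: captures local n-gram information.
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size,
                              padding=kernel_size // 2)
        # LSTM channel: captures contextual (sequential) features.
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        # One attention scorer per channel.
        self.cnn_attn = nn.Linear(num_filters, 1)
        self.lstm_attn = nn.Linear(2 * hidden_dim, 1)
        # Fuse the two attended representations and classify.
        self.fc = nn.Linear(num_filters + 2 * hidden_dim, num_classes)

    @staticmethod
    def attend(features, scorer):
        # features: (batch, seq_len, dim) -> attention-weighted sum over the sequence.
        scores = torch.softmax(scorer(torch.tanh(features)), dim=1)  # (batch, seq_len, 1)
        return (scores * features).sum(dim=1)                        # (batch, dim)

    def forward(self, token_ids):
        emb = self.embedding(token_ids)                               # (batch, seq_len, embed_dim)
        cnn_out = self.conv(emb.transpose(1, 2)).transpose(1, 2)      # (batch, seq_len, num_filters)
        lstm_out, _ = self.lstm(emb)                                  # (batch, seq_len, 2*hidden_dim)
        cnn_vec = self.attend(torch.relu(cnn_out), self.cnn_attn)     # attended CNN channel
        lstm_vec = self.attend(lstm_out, self.lstm_attn)              # attended LSTM channel
        fused = torch.cat([cnn_vec, lstm_vec], dim=-1)                # feature fusion of both channels
        return self.fc(fused)                                         # class logits


# Usage sketch: a batch of 8 sequences of length 50 over a 10k-word vocabulary.
model = MultiChannelAttentionClassifier(vocab_size=10000, num_classes=4)
logits = model(torch.randint(0, 10000, (8, 50)))
print(logits.shape)  # torch.Size([8, 4])
```

In this sketch each channel gets its own attention scorer so that word-level importance is estimated separately for the local (CNN) and contextual (LSTM) representations before the two attended vectors are fused for classification.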

Key words: text classification, Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), multi-channel attention, feature fusion
