Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (23): 154-162. DOI: 10.3778/j.issn.1002-8331.2104-0212

• Pattern Recognition and Artificial Intelligence •

Multi-channel Attention Mechanism Text Classification Model Based on CNN and LSTM

TENG Jinbao, KONG Weiwei, TIAN Qiaoxin, WANG Zhaoqian, LI Long   

  1. Xi’an University of Posts and Telecommunications, Xi’an 710121, China
  2. Guilin University of Electronic Technology, Guilin, Guangxi 541004, China
  3. Shaanxi Provincial Key Laboratory of Network Data Analysis and Intelligent Processing, Xi’an 710121, China
  4. Guangxi Key Laboratory of Trusted Software, Guilin, Guangxi 541004, China
  • Online: 2021-12-01  Published: 2021-12-02

Abstract:

Aiming at the problem that the traditional Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) network cannot reflect the importance of each word in the text when extracting features, this paper proposes a multi-channel attention mechanism text classification model based on CNN and LSTM. Firstly, CNN and LSTM are used to extract the local information and the context features of the text; secondly, a multi-channel attention mechanism is used to compute attention scores over the outputs of the CNN and the LSTM; finally, the outputs of the multi-channel attention mechanism are fused, so that text features are extracted effectively while attention is focused on the important words. Experimental results on three public datasets show that the proposed model outperforms CNN, LSTM and their improved variants, and can effectively improve text classification performance.
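The architecture described in the abstract can be illustrated with a minimal PyTorch sketch. All layer sizes, the single-kernel CNN channel, the per-channel attention scorers, and fusion by concatenation are illustrative assumptions for this sketch, not the authors' published configuration:

```python
# Minimal sketch of a CNN + LSTM multi-channel attention classifier (PyTorch).
# Hyperparameters and the concatenation-based fusion are assumptions for
# illustration; they are not taken from the paper.
import torch
import torch.nn as nn


class MultiChannelAttentionClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, num_classes=2,
                 num_filters=100, kernel_size=3, hidden_dim=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # CNN channel: captures local n-gram information.
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size,
                              padding=kernel_size // 2)
        # LSTM channel: captures contextual (sequential) features.
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        # One attention scorer per channel.
        self.cnn_attn = nn.Linear(num_filters, 1)
        self.lstm_attn = nn.Linear(2 * hidden_dim, 1)
        # Fuse the two attended representations and classify.
        self.fc = nn.Linear(num_filters + 2 * hidden_dim, num_classes)

    @staticmethod
    def attend(features, scorer):
        # features: (batch, seq_len, dim) -> attention-weighted sum over the sequence.
        scores = torch.softmax(scorer(torch.tanh(features)), dim=1)  # (batch, seq_len, 1)
        return (scores * features).sum(dim=1)                        # (batch, dim)

    def forward(self, token_ids):
        emb = self.embedding(token_ids)                               # (batch, seq_len, embed_dim)
        cnn_out = self.conv(emb.transpose(1, 2)).transpose(1, 2)      # (batch, seq_len, num_filters)
        lstm_out, _ = self.lstm(emb)                                  # (batch, seq_len, 2*hidden_dim)
        cnn_vec = self.attend(torch.relu(cnn_out), self.cnn_attn)     # attended CNN channel
        lstm_vec = self.attend(lstm_out, self.lstm_attn)              # attended LSTM channel
        fused = torch.cat([cnn_vec, lstm_vec], dim=-1)                # feature fusion of both channels
        return self.fc(fused)                                         # class logits


# Usage sketch: a batch of 8 sequences of length 50 over a 10k-word vocabulary.
model = MultiChannelAttentionClassifier(vocab_size=10000, num_classes=4)
logits = model(torch.randint(0, 10000, (8, 50)))
print(logits.shape)  # torch.Size([8, 4])
```

In this sketch each channel gets its own attention scorer so that word-level importance is estimated separately for the local (CNN) and contextual (LSTM) representations before the two attended vectors are fused for classification.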

Key words: text classification, Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), multi-channel attention, feature fusion
