Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (20): 103-110. DOI: 10.3778/j.issn.1002-8331.2209-0093

• Pattern Recognition and Artificial Intelligence •

Multiscale Double-Layer Convolution and Global Feature Text Classification Model

SONG Zhongshan, NIU Yue, ZHENG Lu, TIE Jun, JIANG Hai   

  1. College of Computer Science, South-Central Minzu University, Wuhan 430070, China
    2. Hubei Provincial Engineering Research Center for Intelligent Management of Manufacturing Enterprise, Wuhan 430070, China
    3. Hubei Provincial Engineering Research Center of Agricultural Blockchain and Intelligent Management, Wuhan 430070, China
  • Online: 2023-10-15    Published: 2023-10-15

Abstract: Bi-directional long short-term memory (BiLSTM) networks and convolutional neural networks (CNN) each have limited feature-extraction ability on their own, which restricts classification accuracy. To address this, a joint model is proposed that combines an improved double-layer CNN with a BiLSTM equipped with an attention mechanism. Because a single-layer CNN captures only limited local features, the model applies up-sampling to the multi-scale combined convolutions and adds a skip connection to the original text, enlarging the local receptive field of the subsequent convolution and thereby strengthening feature extraction and enriching the semantics. At the same time, an attention mechanism is introduced into the BiLSTM so that it attends to the features of key words. The features of the improved CNN and of the BiLSTM are fused, and the result is fed into a fully connected layer for the multi-class classification task. The model is evaluated on two public Chinese news classification datasets. Experimental results show that the proposed model reaches an accuracy of 93.62% on the public THUCNews dataset, 3.01 percentage points higher than a single-channel plain CNN model and 2.2 percentage points higher than a two-channel CNN-LSTM model.

Key words: text classification, bi-directional long short-term memory network (BiLSTM), convolutional neural network (CNN), attention mechanism
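The sketch below illustrates, in PyTorch, the architecture the abstract describes: a double-layer multi-scale CNN branch with up-sampling and a skip connection to the original text, a BiLSTM branch with attention, feature fusion, and a fully connected classifier. The class name, kernel sizes, filter counts, hidden sizes, nearest-neighbour up-sampling, and concatenation-based fusion are illustrative assumptions rather than the authors' exact configuration.

# Hypothetical PyTorch sketch of the joint model; layer sizes and the fusion scheme are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleCNNBiLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, num_classes=10,
                 kernel_sizes=(2, 3, 4), num_filters=64, lstm_hidden=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # First layer: multi-scale combined convolutions over the embedded text.
        self.convs1 = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k, padding=k // 2) for k in kernel_sizes])
        # Second layer: convolution over the up-sampled multi-scale features
        # concatenated with the original embeddings (the skip connection).
        self.conv2 = nn.Conv1d(num_filters * len(kernel_sizes) + embed_dim,
                               num_filters, kernel_size=3, padding=1)
        # BiLSTM branch with additive attention over its hidden states.
        self.bilstm = nn.LSTM(embed_dim, lstm_hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * lstm_hidden, 1)
        # Fused features go to a fully connected classification layer.
        self.fc = nn.Linear(num_filters + 2 * lstm_hidden, num_classes)

    def forward(self, x):                        # x: (batch, seq_len) token ids
        emb = self.embedding(x)                  # (batch, seq_len, embed_dim)
        # Improved double-layer CNN branch.
        c = emb.transpose(1, 2)                  # (batch, embed_dim, seq_len)
        feats = [F.relu(conv(c)) for conv in self.convs1]
        feats = [F.interpolate(f, size=c.size(2)) for f in feats]    # up-sample to seq_len
        cnn_out = F.relu(self.conv2(torch.cat(feats + [c], dim=1)))  # skip connection to the text
        cnn_vec = F.max_pool1d(cnn_out, cnn_out.size(2)).squeeze(2)
        # BiLSTM branch; attention weights emphasise key-word features.
        h, _ = self.bilstm(emb)                  # (batch, seq_len, 2*lstm_hidden)
        weights = torch.softmax(self.attn(h), dim=1)
        lstm_vec = (weights * h).sum(dim=1)
        # Feature fusion followed by the fully connected layer for classification.
        return self.fc(torch.cat([cnn_vec, lstm_vec], dim=1))

# Example usage with random token ids (hypothetical vocabulary and class counts).
model = MultiScaleCNNBiLSTM(vocab_size=5000, num_classes=10)
logits = model(torch.randint(0, 5000, (8, 50)))  # (batch=8, seq_len=50) -> (8, 10)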
