Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (16): 157-163. DOI: 10.3778/j.issn.1002-8331.2101-0196

• Pattern Recognition and Artificial Intelligence •


Dual-Channel DAC-RNN Text Classification Model Based on Attention Mechanism

LI Qihang, LIAO Wei, MENG Jingwen   

  1. College of Electronic & Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  • Online: 2022-08-15 Published: 2022-08-15


Abstract: When classifying Chinese text, key features are distributed unevenly across the text, so they are easily lost, which reduces classification accuracy. To address this problem, a dual-channel text classification model based on an attention mechanism is proposed. First, the input text is represented as vectors through word embedding; a Bi-LSTM channel extracts contextual information from the text, while a CNN channel extracts local features among consecutive words. Second, an attention mechanism is introduced in both channels to assign global weights, allowing the model to focus on the keywords in the text. In addition, in the CNN channel, the original input vector is selectively fused with the output vector of each CNN layer to realize feature reuse. Performance is evaluated on two public datasets, Toutiao and THUCNews. Experimental results show that the proposed model achieves classification accuracies of 97.59% and 90.09% respectively, outperforming the compared classification models.
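The two mechanisms named in the abstract — attention-based global weighting within each channel, and selective fusion of the original input with CNN outputs for feature reuse — can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the learned query vector `w`, the fusion gate `g`, and all shapes are illustrative assumptions standing in for trained parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(H, w):
    # H: (seq_len, hidden) channel outputs; w: (hidden,) learned query.
    # Scores every position, normalizes to global weights, returns the
    # weighted sum -- positions with key features get larger weights.
    scores = H @ w                 # (seq_len,)
    alpha = softmax(scores)        # global weight distribution, sums to 1
    return alpha @ H               # (hidden,) attended representation

def selective_fuse(x, conv_out, g):
    # Gated fusion of the original input with a CNN layer's output,
    # so earlier features are reused rather than discarded. g in [0, 1].
    return g * x + (1.0 - g) * conv_out

rng = np.random.default_rng(0)
seq_len, hidden = 10, 8
H_rnn = rng.normal(size=(seq_len, hidden))   # stand-in for Bi-LSTM outputs
H_cnn = rng.normal(size=(seq_len, hidden))   # stand-in for CNN feature maps
x_in = rng.normal(size=(seq_len, hidden))    # stand-in for embedded input

H_cnn = selective_fuse(x_in, H_cnn, g=0.5)   # feature reuse in the CNN channel

w = rng.normal(size=hidden)                  # shared illustrative query
v_rnn = attention_pool(H_rnn, w)
v_cnn = attention_pool(H_cnn, w)
v = np.concatenate([v_rnn, v_cnn])           # dual-channel fusion, fed to classifier
print(v.shape)  # (16,)
```

In a trained model `w` and `g` would be learned parameters and the concatenated vector `v` would feed a softmax classifier; here fixed random values simply make the data flow of the two channels concrete.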

Key words: text classification, convolutional neural network (CNN), attention mechanism, dual-channel, feature reuse