Computer Engineering and Applications ›› 2019, Vol. 55 ›› Issue (24): 135-140. DOI: 10.3778/j.issn.1002-8331.1809-0015

• Pattern Recognition and Artificial Intelligence •

Text Classification Research Based on Improved CBOW and ABiGRU (基于改进的CBOW与ABiGRU的文本分类研究)

ZHANG Yuyi (张宇艺), ZUO Yayao (左亚尧), CHEN Xiaobang (陈小帮)

  1. Faculty of Computer, Guangdong University of Technology, Guangzhou 510006, China
  • Publication date: 2019-12-15  Release date: 2019-12-11

Abstract: Text representation and text feature extraction are the core problems to be solved in text classification. To address them, this paper proposes a text classification model based on an improved Continuous Bag-of-Words (CBOW) model and ABiGRU. The model uses the word vectors trained by the improved CBOW model as its word embedding layer. Text features are then extracted by the convolutional and pooling layers of a convolutional neural network and by a bidirectional gated recurrent unit (BiGRU) network combined with an attention mechanism. The resulting text feature vector is fed to a softmax classifier for classification. Text classification experiments on three corpora show that the proposed method performs better than the other text classification algorithms it was compared with.

Key words: deep learning, Continuous Bag-of-Words (CBOW), attention mechanism, neural network, text classification
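
The pipeline named in the abstract (word embeddings, convolution and pooling, attention over BiGRU states, softmax) can be illustrated with a minimal forward-pass sketch. The abstract gives no layer sizes, does not say how the improved CBOW differs from the standard one, and does not specify the attention formulation or the exact ordering of the CNN and BiGRU stages, so everything below is an assumption made for illustration: random weights stand in for trained parameters and for the CBOW embeddings, the stages are chained serially, biases are omitted, and attention is a dot product with a hypothetical context vector `ctx`. It is not the authors' implementation.

```python
import math
import random

random.seed(0)  # reproducible stand-in weights (nothing here is trained)

# Hypothetical sizes; the abstract does not state any of these.
VOCAB, EMB, WIDTH, FILTERS, HIDDEN, CLASSES = 50, 8, 3, 6, 5, 3

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def conv_pool(seq, kernel, width=WIDTH, pool=2):
    """1D convolution with ReLU over word windows, then max pooling over time."""
    windows = [sum(seq[i:i + width], []) for i in range(len(seq) - width + 1)]
    feats = [[max(0.0, a) for a in matvec(kernel, w)] for w in windows]
    return [[max(col) for col in zip(*feats[i:i + pool])]
            for i in range(0, len(feats), pool)]

def gru(seq, p):
    """Run a bias-free GRU over seq and return the hidden state at every step."""
    h, states = [0.0] * HIDDEN, []
    for x in seq:
        z = [sigmoid(a) for a in matvec(p["z"], x + h)]  # update gate
        r = [sigmoid(a) for a in matvec(p["r"], x + h)]  # reset gate
        gated = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a) for a in matvec(p["h"], x + gated)]
        h = [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]
        states.append(h)
    return states

def classify(token_ids, params):
    embedded = [params["emb"][t] for t in token_ids]      # embedding lookup
    local = conv_pool(embedded, params["kernel"])         # CNN features
    fwd = gru(local, params["fwd"])                       # left-to-right pass
    bwd = gru(local[::-1], params["bwd"])[::-1]           # right-to-left pass
    states = [f + b for f, b in zip(fwd, bwd)]            # BiGRU states
    scores = [sum(u * s for u, s in zip(params["ctx"], h)) for h in states]
    alpha = softmax(scores)                               # attention weights
    doc = [sum(a * h[k] for a, h in zip(alpha, states)) for k in range(2 * HIDDEN)]
    return softmax(matvec(params["out"], doc)), alpha     # class probabilities

params = {
    "emb": rand_matrix(VOCAB, EMB),  # placeholder for CBOW-trained vectors
    "kernel": rand_matrix(FILTERS, WIDTH * EMB),
    "fwd": {g: rand_matrix(HIDDEN, FILTERS + HIDDEN) for g in "zrh"},
    "bwd": {g: rand_matrix(HIDDEN, FILTERS + HIDDEN) for g in "zrh"},
    "ctx": [random.uniform(-0.1, 0.1) for _ in range(2 * HIDDEN)],
    "out": rand_matrix(CLASSES, 2 * HIDDEN),
}

probs, alpha = classify([3, 17, 42, 8, 25, 1, 9, 30], params)
print("class probabilities:", probs)
print("attention weights:", alpha)
```

With eight input tokens, a window width of 3, and a pooling size of 2, the sketch produces three time steps for the BiGRU, so `alpha` holds three attention weights that sum to 1. In practice the embedding table would be loaded from the trained CBOW model and all other weights learned jointly with a deep learning framework.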