Computer Engineering and Applications, 2019, Vol. 55, Issue (5): 135-142. DOI: 10.3778/j.issn.1002-8331.1803-0090

Character-Level Convolutional Neural Networks for Short Text Classification

LIU Jingxue1, MENG Fanrong1, ZHOU Yong1, LIU Bing1,2   

  1. College of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
  2. Institute of Electronics, Chinese Academy of Sciences, Beijing 100080, China
  • Online: 2019-03-01 Published: 2019-03-06

字符级卷积神经网络短文本分类算法

刘敬学1,孟凡荣1,周  勇1,刘  兵1,2   

  1. 1.中国矿业大学 计算机科学与技术学院,江苏 徐州 221116
    2.中国科学院 电子研究所,北京 100080

Abstract: Short texts are brief, have sparse features, and depend strongly on context, so traditional models classify them with limited accuracy. To address this, this article proposes a neural network model for short text classification that combines a Convolutional Neural Network (CNN) with a Long Short-Term Memory (LSTM) network on top of character-level embeddings. The model also incorporates the highway networks framework to ease the difficulty of training deep networks and to improve classification accuracy. Evaluations on several datasets show that the proposed model outperforms traditional models and other CNN-based models on the short text classification task.

Key words: character-level, neural network, text classification, highway networks

摘要: 由于短文本具有长度短、特征稀疏以及上下文依赖性强等特点,传统方法对其直接进行分类精度有限。针对该问题,提出了一种基于字符级嵌入的卷积神经网络(CNN)和长短时记忆网络(LSTM)相结合的神经网络模型进行短文本的分类。该模型同时包括了高速公路网络(Highway networks)框架,用于缓解深度神经网络训练时的困难,提高分类的准确性。通过对几种数据集的测试,结果表明提出的模型在短文本分类任务中优于传统模型和其他基于CNN的分类模型。

关键词: 字符级, 神经网络, 文本分类, 高速公路网络
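
The abstract names three building blocks: character-level input, convolutional feature extraction, and a highway layer. The following is a minimal sketch of how those pieces could fit together, not the authors' implementation. The alphabet, the dimensions, the random weights, and all function names (encode_chars, conv_max_pool, highway) are assumptions made for illustration, and the LSTM stage is omitted for brevity. The highway layer uses the standard gated form y = t * H(x) + (1 - t) * x.

```python
import math
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "  # assumed character set


def encode_chars(text, max_len=16):
    """Turn text into fixed-length character ids; 0 marks padding or unknown characters."""
    index = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
    ids = [index.get(ch, 0) for ch in text.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))


def random_matrix(rows, cols, rng):
    """Small random weights standing in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector, bias):
    """Compute matrix @ vector + bias with plain lists."""
    return [sum(w * x for w, x in zip(row, vector)) + b for row, b in zip(matrix, bias)]


def conv_max_pool(embedded, filters, width):
    """Apply each filter to every window of `width` characters, then max-pool over time."""
    windows = [
        [value for vec in embedded[i:i + width] for value in vec]  # flatten one window
        for i in range(len(embedded) - width + 1)
    ]
    return [
        max(math.tanh(sum(w * x for w, x in zip(f, win))) for win in windows)
        for f in filters
    ]


def highway(x, w_h, b_h, w_t, b_t):
    """Highway layer: y = t * relu(W_h x + b_h) + (1 - t) * x, with t = sigmoid(W_t x + b_t)."""
    h = [max(0.0, v) for v in matvec(w_h, x, b_h)]
    t = [1.0 / (1.0 + math.exp(-v)) for v in matvec(w_t, x, b_t)]
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]


rng = random.Random(0)
emb_dim, n_filters, width = 4, 6, 3                          # illustrative sizes only
embedding = random_matrix(len(ALPHABET) + 1, emb_dim, rng)   # one row per character id
filters = random_matrix(n_filters, width * emb_dim, rng)

ids = encode_chars("short text")
embedded = [embedding[i] for i in ids]
features = conv_max_pool(embedded, filters, width)

w_h = random_matrix(n_filters, n_filters, rng)
w_t = random_matrix(n_filters, n_filters, rng)
# A negative gate bias makes the layer start out close to the identity (carry) path.
out = highway(features, w_h, [0.0] * n_filters, w_t, [-2.0] * n_filters)
print(len(ids), len(features), len(out))  # prints: 16 6 6
```

The gate t decides, per feature, how much of the transformed signal replaces the input. When t is near zero the layer passes its input through almost unchanged, which is the property usually credited with making deeper networks easier to train.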