Computer Engineering and Applications (计算机工程与应用) ›› 2023, Vol. 59 ›› Issue (4): 43-53. DOI: 10.3778/j.issn.1002-8331.2209-0048

• Hot Topics and Reviews •

基于深度学习的短文本分类方法研究综述

淦亚婷,安建业,徐雪   

  1. 天津商业大学 理学院,天津 300134
  • 出版日期:2023-02-15 发布日期:2023-02-15

Survey of Short Text Classification Methods Based on Deep Learning

GAN Yating, AN Jianye, XU Xue   

  1. School of Science, Tianjin University of Commerce, Tianjin 300134, China
  • Online: 2023-02-15 Published: 2023-02-15

摘要: 从CNN、RNN、CNN-RNN、GCN及其他深度学习方法五方面,全面分析了深度学习在短文本分类应用中的研究现状,比较了各自的优缺点,总结了常用的标签数据集。结果表明:目前深度学习在短文本分类中的应用研究主要集中在高效算法改进以及文本信息拓展两方面;对模型检验中构建标签数据集的研究也处于起步阶段,大多是针对影评、商品评论、新闻等特定领域的,还需不断完善;基于深度学习的短文本分类方法研究,今后在理论研究方面将重点关注算法改进、信息拓展以及二者的相互融合,在实践中探索某些分类效果较好的特定领域应用。

关键词: 短文本, 文本分类, 深度学习, 文本表示

Abstract: This survey analyzes the state of research on deep learning for short text classification along five lines: CNN, RNN, hybrid CNN-RNN, GCN, and other deep learning methods. It compares the strengths and weaknesses of each approach and summarizes the commonly used labeled datasets. The analysis shows that current work concentrates on two directions: developing more efficient algorithms and expanding the information available in short texts. Research on constructing labeled datasets for model evaluation is still at an early stage; most existing datasets target specific domains such as movie reviews, product reviews, and news, and need further development. Going forward, theoretical work is expected to focus on algorithm improvement, information expansion, and the integration of the two, while applied work will explore specific domains in which classification performs well.

Key words: short text, text classification, deep learning, text representation