Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (9): 237-243. DOI: 10.3778/j.issn.1002-8331.2302-0396

• Pattern Recognition and Artificial Intelligence •

Chinese Short Text Classification with Hybrid Features and Multi-Head Attention

JIANG Jielin, ZHU Yongwei, XU Xiaolong, CUI Yan, ZHAO Yingnan

  1. School of Software, Nanjing University of Information Science and Technology, Nanjing 210044, China
    2. Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET), Nanjing University of Information Science and Technology, Nanjing 210044, China
    3. College of Mathematics and Information Science, Nanjing Normal University of Special Education, Nanjing 210038, China
    4. School of Computer Science, School of Cyber Science and Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China
  • Online: 2024-05-01 Published: 2024-04-29

Chinese Short Text Classification with Hybrid Features and Multi-Head Attention

JIANG Jielin, ZHU Yongwei, XU Xiaolong, CUI Yan, ZHAO Yingnan   

  1. School of Software, Nanjing University of Information Science and Technology, Nanjing 210044, China
    2. Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET), Nanjing University of Information Science and Technology, Nanjing 210044, China
    3. College of Mathematics and Information Science, Nanjing Normal University of Special Education, Nanjing 210038, China
    4. School of Computer Science, School of Cyber Science and Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China
  • Online:2024-05-01 Published:2024-04-29

Abstract: Traditional approaches to short text classification suffer from two shortcomings: they cannot fully represent the semantic information of a text, and they cannot adequately extract and fuse its global and local information. To address this, a Chinese short text classification method based on hybrid features and multi-head attention (HF-MHA) is proposed. The method uses a pre-trained model to compute character-level and word-level vector representations of Chinese short texts, obtaining a more comprehensive feature vector representation of the text; a multi-head attention mechanism captures the dependencies within the text sequence to improve semantic understanding; convolutional neural networks extract features from the two vector representations separately and fuse them into a single feature vector, integrating the global and local information of the text; the classification result is obtained through the output layer. Experiments on three public datasets show that HF-MHA can effectively improve the performance of Chinese short text classification.

Key words: Chinese short text classification, attention mechanism, word-level vector, character-level vector

Abstract: Traditional short text classification methods have two shortcomings: they cannot fully represent the semantic information of a text, and they cannot effectively extract and integrate its global and local information. To address this, a Chinese short text classification method with hybrid features and multi-head attention (HF-MHA) is proposed. The method uses pre-trained models to compute character-level and word-level vector representations of Chinese short texts, giving a more comprehensive feature representation of the text. A multi-head attention mechanism then captures the dependency relationships within each sequence to improve semantic understanding. Convolutional neural networks extract features from the two representations separately, and the results are fused into a single feature vector that integrates the global and local information of the text. Finally, the classification result is obtained through the output layer. Experiments on three public datasets show that HF-MHA effectively improves the performance of Chinese short text classification.
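The PyTorch sketch below illustrates the pipeline described in the abstract: a character-level branch and a word-level branch, multi-head self-attention over each sequence, convolutional feature extraction, fusion of the two branches, and an output layer. It is a minimal illustration under stated assumptions, not the authors' implementation: the class name HFMHASketch, all vocabulary sizes, dimensions, and kernel sizes are hypothetical, and randomly initialized embeddings stand in for the pre-trained character- and word-level vectors used in the paper.

```python
# Minimal sketch of an HF-MHA-style pipeline (assumed dimensions and layer sizes).
import torch
import torch.nn as nn


class HFMHASketch(nn.Module):
    def __init__(self, char_vocab=5000, word_vocab=20000, emb_dim=128,
                 num_heads=4, num_filters=64, kernel_sizes=(2, 3, 4), num_classes=10):
        super().__init__()
        # Hybrid features: separate character-level and word-level embeddings
        # (random here; the paper uses pre-trained models to produce these vectors).
        self.char_emb = nn.Embedding(char_vocab, emb_dim)
        self.word_emb = nn.Embedding(word_vocab, emb_dim)
        # Multi-head attention captures dependencies within each sequence.
        self.char_attn = nn.MultiheadAttention(emb_dim, num_heads, batch_first=True)
        self.word_attn = nn.MultiheadAttention(emb_dim, num_heads, batch_first=True)
        # CNNs with several kernel sizes extract local n-gram features per branch.
        self.char_convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, num_filters, k) for k in kernel_sizes])
        self.word_convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, num_filters, k) for k in kernel_sizes])
        # Output layer over the fused character- and word-level feature vector.
        self.fc = nn.Linear(2 * num_filters * len(kernel_sizes), num_classes)

    def _branch(self, emb, attn, convs, ids):
        x = emb(ids)                      # (batch, seq_len, emb_dim)
        x, _ = attn(x, x, x)              # self-attention over the sequence
        x = x.transpose(1, 2)             # (batch, emb_dim, seq_len) for Conv1d
        feats = [torch.relu(conv(x)).max(dim=2).values for conv in convs]
        return torch.cat(feats, dim=1)    # (batch, num_filters * len(kernel_sizes))

    def forward(self, char_ids, word_ids):
        char_feat = self._branch(self.char_emb, self.char_attn, self.char_convs, char_ids)
        word_feat = self._branch(self.word_emb, self.word_attn, self.word_convs, word_ids)
        fused = torch.cat([char_feat, word_feat], dim=1)  # fuse the two branches
        return self.fc(fused)


# Toy usage: a batch of 2 texts with 32 character indices and 20 word indices each.
model = HFMHASketch()
logits = model(torch.randint(0, 5000, (2, 32)), torch.randint(0, 20000, (2, 20)))
print(logits.shape)  # torch.Size([2, 10])
```

Max-pooling each convolution output keeps the strongest local response per filter, one common way to turn variable-length sequences into fixed-size vectors before the two branches are concatenated and classified.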

Key words: Chinese short text classification, attention mechanism, word-level vector, character-level vector