Multi-label Text Classification Based on Joint Model

Computer Engineering and Applications, 2020, Vol. 56, Issue 14: 111-117. DOI: 10.3778/j.issn.1002-8331.1904-0273

LIU Xinhui, CHEN Wenshi, ZHOU Ai, CHEN Fei, QU Wen, LU Mingyu

College of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning 116026, China

Online: 2020-07-15; Published: 2020-07-14

Abstract

Most current multi-label text classification algorithms ignore both the varying importance of individual words in a text sequence and the influence of text features at different levels. This paper proposes ATT-Capsule-BiLSTM, a joint model that combines multi-head attention, a capsule network (CapsuleNet), and a bidirectional long short-term memory network (BiLSTM). First, the text sequence is converted to word vectors, and multi-head attention learns a weight distribution over the words on top of those vectors. The capsule network and the BiLSTM then extract feature representations of local spatial information and contextual temporal information, respectively; the two representations are fused by averaging in a fusion layer and passed to a sigmoid classifier. Comparative experiments on two datasets, Reuters-21578 and AAPD, show that the proposed joint model performs well despite its simple architecture, reaching F1 scores of 89.82% and 67.48%, respectively.

Key words: multi-label text classification, multi-head attention, CapsuleNet, Bidirectional Long Short-Term Memory network (BiLSTM), joint model
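
The last two stages named in the abstract, feature fusion followed by sigmoid classification, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the fusion layer takes an element-wise average of the Capsule and BiLSTM feature vectors, and the function names, the toy feature values, the weight matrix, and the 0.5 decision threshold are all hypothetical choices made for illustration.

```python
import math
from typing import List


def average_fusion(capsule_feat: List[float], bilstm_feat: List[float]) -> List[float]:
    """Element-wise mean of two equally sized feature vectors (assumed fusion rule)."""
    if len(capsule_feat) != len(bilstm_feat):
        raise ValueError("feature vectors must have the same length")
    return [(c + b) / 2.0 for c, b in zip(capsule_feat, bilstm_feat)]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def predict_labels(
    fused: List[float],
    weights: List[List[float]],
    biases: List[float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[int]:
    """Score each label with its own sigmoid and keep those at or above the threshold."""
    labels = []
    for idx, (w_row, b) in enumerate(zip(weights, biases)):
        logit = sum(w * f for w, f in zip(w_row, fused)) + b
        if sigmoid(logit) >= threshold:
            labels.append(idx)
    return labels


# Toy values standing in for the outputs of the two branches (illustrative only).
capsule_feat = [0.2, -1.0, 0.6]
bilstm_feat = [0.4, 1.0, -0.2]
fused = average_fusion(capsule_feat, bilstm_feat)

# One weight row and one bias per label; three hypothetical labels.
weights = [[2.0, 0.0, 0.0], [0.0, 1.0, -5.0], [1.0, 1.0, 1.0]]
biases = [0.0, 0.0, 0.0]
predicted = predict_labels(fused, weights, biases)
print(predicted)  # labels 0 and 2 pass the threshold
```

Because each label has its own sigmoid, the scores are not forced to sum to one as they would be under softmax, so a document can receive several labels at once or none, which is what distinguishes the multi-label setting from single-label classification.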

LIU Xinhui, CHEN Wenshi, ZHOU Ai, CHEN Fei, QU Wen, LU Mingyu. Multi-label Text Classification Based on Joint Model[J]. Computer Engineering and Applications, 2020, 56(14): 111-117.
