Multi-Label Text Classification Based on DistilBERT and Label Correlation

doi:10.3778/j.issn.1002-8331.2308-0445

Abstract

Abstract: Existing multi-label text classification methods often ignore the correlations among labels and their semantic information, so label features are extracted inadequately and the correlation information between labels is hard to exploit. To address this problem, a model named IDLC is proposed that fuses DistilBERT with label correlation information. First, DistilBERT is used to obtain word vector representations of both the text and the labels, and also to obtain global features that carry the contextual information of the text. Second, a CNN extracts local features of the text; the relationships among the labels are represented as a graph, and a graph attention network with a multi-head attention mechanism captures label features that carry correlation information. Finally, a feature fusion method combines the label features with the text features to build a text representation containing more feature information, thereby improving the classification accuracy of the model. Experimental results on the benchmark dataset show that, compared with the baseline model, the proposed method effectively improves model performance and achieves better results on the multi-label text classification task.

Key words: multi-label text classification, DistilBERT, graph attention network, convolutional neural network (CNN)
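
The label branch and the fusion step described in the abstract can be illustrated with a small standard-library sketch. It is not the authors' implementation: the co-occurrence rule for building the label graph, the identity projection inside each attention head, the averaging of heads, and the dot-product fusion with a sigmoid are all assumptions made for illustration, and the names `cooccurrence_adjacency`, `attention_head`, `multi_head`, and `fuse` are hypothetical. The vector `text_vec` stands in for the text features that DistilBERT and the CNN would produce in the full model.

```python
import math
from itertools import combinations


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cooccurrence_adjacency(label_sets, num_labels):
    """Assumed graph rule: link two labels if they co-occur in any sample.
    Self-loops are added so every label attends at least to itself."""
    adj = [[1 if i == j else 0 for j in range(num_labels)]
           for i in range(num_labels)]
    for labels in label_sets:
        for i, j in combinations(sorted(set(labels)), 2):
            adj[i][j] = adj[j][i] = 1
    return adj


def attention_head(features, adj, a_src, a_dst, slope=0.2):
    """One graph-attention head. The learned projection W is omitted
    (treated as identity) to keep the sketch short."""
    out = []
    for i, h_i in enumerate(features):
        nbrs = [j for j, linked in enumerate(adj[i]) if linked]
        scores = []
        for j in nbrs:
            e = dot(a_src, h_i) + dot(a_dst, features[j])
            scores.append(e if e > 0 else slope * e)  # LeakyReLU
        alpha = softmax(scores)  # attention weights over the neighbours of i
        out.append([sum(a * features[j][d] for a, j in zip(alpha, nbrs))
                    for d in range(len(h_i))])
    return out


def multi_head(features, adj, heads):
    """Run several heads and average their outputs (one common choice)."""
    outs = [attention_head(features, adj, a_src, a_dst)
            for a_src, a_dst in heads]
    return [[sum(o[i][d] for o in outs) / len(outs)
             for d in range(len(features[0]))]
            for i in range(len(features))]


def fuse(text_vec, label_feats):
    """Assumed fusion: dot product of text and label features, then sigmoid,
    giving one independent probability per label."""
    return [1.0 / (1.0 + math.exp(-dot(text_vec, lf))) for lf in label_feats]


# Toy data: 4 labels with 3-dimensional embeddings (placeholder values).
label_sets = [[0, 1], [1, 2], [3]]
label_emb = [[0.1, 0.3, -0.2], [0.4, -0.1, 0.2],
             [-0.3, 0.2, 0.5], [0.0, 0.6, -0.4]]
heads = [([0.5, -0.2, 0.1], [0.3, 0.1, -0.4]),
         ([-0.1, 0.4, 0.2], [0.2, -0.3, 0.1])]
text_vec = [0.2, -0.5, 0.7]  # placeholder for DistilBERT + CNN text features

adj = cooccurrence_adjacency(label_sets, num_labels=4)
refined = multi_head(label_emb, adj, heads)
probs = fuse(text_vec, refined)
print([round(p, 3) for p in probs])
```

In this toy graph, label 3 never co-occurs with another label, so it only attends to itself and its refined feature equals its input embedding; the other labels receive a weighted mix of their neighbours' embeddings, which is how correlation information enters the label features.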

WANG Xuyang, GENG Liuqing, ZHANG Xin. Multi-Label Text Classification Based on DistilBERT and Label Correlation[J]. Computer Engineering and Applications, 2024, 60(23): 168-175.

王旭阳, 耿留青, 张鑫. 结合DistilBERT与标签关联性的多标签文本分类[J]. 计算机工程与应用, 2024, 60(23): 168-175.
