Multi-Label Text Classification Based on Label Combination and Fusion of Attentions

doi:10.3778/j.issn.1002-8331.2110-0005

Abstract

Traditional multi-label text classification algorithms fall short in mining the correlations among labels and in extracting the discriminative information between texts and labels. To address this, a multi-label text classification algorithm is proposed that combines a label-combination pre-training model with multi-granularity fused attention. First, a text encoder that captures label correlations is obtained by training the label-combination pre-training model. A gated fusion strategy then fuses the pre-trained language model and word vectors into word embedding representations, which are fed to the pre-trained encoder to generate text representations based on label semantics. Finally, self-attention and label attention enhanced by multi-layer dilated convolution extract global information and fine-grained semantic information, respectively; these are adaptively fused and passed to a multi-layer perceptron for multi-label prediction. Experimental results on a specific threat recognition dataset and two general multi-label text classification datasets show that the proposed method effectively captures the association information between labels and texts and achieves clear improvements in F1 score, Hamming loss, and recall.
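The gated fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the gate's exact form, so everything here is an assumption: the function name `gated_fusion`, the element-wise (diagonal) gate weights `w_plm`, `w_word`, and `bias`, and the convex-combination output are a common textbook formulation, not necessarily the authors' implementation.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gated_fusion(plm_vec, word_vec, w_plm, w_word, bias):
    """Fuse two same-sized embeddings with an element-wise gate.

    Assumed formulation (hypothetical, for illustration only):
        gate  = sigmoid(w_plm * plm_vec + w_word * word_vec + bias)
        fused = gate * plm_vec + (1 - gate) * word_vec
    A full model would typically use learned weight matrices; element-wise
    weights are used here to keep the sketch dependency-free.
    """
    sizes = {len(plm_vec), len(word_vec), len(w_plm), len(w_word), len(bias)}
    if len(sizes) != 1:
        raise ValueError("all inputs must share the same dimension")
    fused = []
    for p, w, a, b, c in zip(plm_vec, word_vec, w_plm, w_word, bias):
        gate = sigmoid(a * p + b * w + c)
        fused.append(gate * p + (1.0 - gate) * w)
    return fused


# With zero weights and bias the gate is 0.5, so the result is the plain average.
plm_embedding = [1.0, 2.0]    # stand-in for a pre-trained language model vector
word_embedding = [3.0, 6.0]   # stand-in for a static word vector
averaged = gated_fusion(plm_embedding, word_embedding,
                        [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
```

A large positive bias pushes the gate toward 1 and the output toward the language-model vector; a large negative bias favors the word vector. In a trained model these parameters would be learned rather than set by hand.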

Key words: multi-label text classification, fusion attention mechanism, dilated convolution
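For readers unfamiliar with the evaluation metrics named in the abstract, the following is a small sketch of Hamming loss and micro-averaged precision, recall, and F1 on binary label matrices. The abstract does not state which averaging scheme was used, so micro-averaging is an assumption here, and the function names are hypothetical.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    total = 0
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += 1
            if t != p:
                wrong += 1
    return wrong / total if total else 0.0


def micro_precision_recall_f1(y_true, y_pred):
    """Micro-averaged precision, recall and F1 over all (sample, label) pairs."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example: two documents, three candidate labels each (made-up data).
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]
loss = hamming_loss(y_true, y_pred)
precision, recall, f1 = micro_precision_recall_f1(y_true, y_pred)
```

In this toy example two of the six label slots are wrong, so the Hamming loss is 1/3; there are two true positives, one false positive, and one false negative, giving precision, recall, and F1 of 2/3 each. Lower is better for Hamming loss, higher for the other three.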

WU Xinke, SUN Jun, LI Zhihua. Multi-Label Text Classification Based on Label Combination and Fusion of Attentions[J]. Computer Engineering and Applications, 2023, 59(6): 125-133.
