Multi-Label Classification of Options Based on Seq2seq Model of Hybrid Attention

doi:10.3778/j.issn.1002-8331.2108-0352

Abstract

Multi-label classification of options is an important step in answering multiple-choice questions for reading comprehension of literature in the college entrance examination (RCL-CEE). Invoking a different answering engine for each type of option can effectively improve answer accuracy on these questions. Because options are complex and varied, a single option may exhibit the characteristics of several categories, so option classification is treated as a multi-label learning task. Traditional multi-label classification methods consider only the correlation between text and labels and ignore the correlation among labels; in addition, the strong semantic relatedness within a single option has a large impact on label prediction. To address these challenges, a Seq2seq model based on hybrid attention is proposed that considers both the correlation between an option and its labels and the correlation within the option. A Bi-LSTM is used to obtain the mutual information from the option to the labels, and multi-head self-attention is used to obtain the semantic correlations within the option. Label embeddings are used to fuse the semantic correlation among labels implicitly. Experimental results on an RCL-CEE multiple-choice question dataset show that modeling these correlations effectively improves the accuracy of multi-label option classification.

Key words: reading comprehension, multi-label text classification, self-attention, option correlation
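
As a rough illustration of the multi-head self-attention named in the abstract, the following minimal sketch computes scaled dot-product self-attention over the token vectors of one option, split across several heads. It is not the authors' implementation: the function names are hypothetical, and the learned query, key, value, and output projections of a real model are replaced by identity mappings so that the example stays small and uses only the Python standard library.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention.

    Assumption: queries, keys, and values are the input vectors themselves
    (identity projections), not learned linear maps.
    """
    d = len(vectors[0])
    outputs = []
    for q in vectors:
        # Similarity of this token to every token in the option.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        weights = softmax(scores)
        # Weighted average of all token vectors.
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(d)]
        )
    return outputs


def multi_head_self_attention(tokens, num_heads):
    """Split each token vector into num_heads slices, attend within each
    slice independently, then concatenate the per-head results."""
    d_model = len(tokens[0])
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads
    heads = []
    for h in range(num_heads):
        sliced = [t[h * d_head:(h + 1) * d_head] for t in tokens]
        heads.append(self_attention(sliced))
    # Concatenate the heads back into vectors of size d_model.
    return [
        [x for h in range(num_heads) for x in heads[h][i]]
        for i in range(len(tokens))
    ]


# Three toy "token" vectors standing in for the words of one option.
option_tokens = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
contextualized = multi_head_self_attention(option_tokens, num_heads=2)
```

Each output vector is a weighted average of all token vectors in the option, which is how self-attention lets every word's representation reflect the rest of the option. In a trained model, the projections omitted here would decide which relations each head attends to.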

CHEN Qian, HAN Lin, WANG Suge, GUO Xin. Multi-Label Classification of Options Based on Seq2seq Model of Hybrid Attention[J]. Computer Engineering and Applications, 2023, 59(4): 104-111.
