Nested Named Entity Recognition Method Based on Span Decoding

doi:10.3778/j.issn.1002-8331.2208-0293

Abstract

Abstract: Span classification is a common approach to nested named entity recognition, but because every span must be enumerated and verified, it suffers from high complexity and data imbalance. Moreover, because each span is predicted independently, the dependencies among entities in the text sequence are ignored. To address these problems, this paper proposes a nested named entity recognition method based on span decoding. First, the text is encoded by combining part-of-speech, character, word, and contextual features to obtain rich semantic information. Then, possible entity start positions are identified, and candidate entity spans are enumerated only from those positions, which reduces the number of potential spans to some extent. Finally, an attention-based decoder predicts, one start position at a time, the types of the entity spans that begin there; information about already predicted entities is passed along during decoding, so that dependencies between entities are captured and learned. Experimental results show that span decoding effectively improves on span classification: the proposed method raises F1 scores by 0.45 and 0.14 percentage points on the public English nested entity datasets ACE2005 and GENIA, respectively.

Key words: nested named entity recognition, span classification, encoder-decoder, neural networks
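
The span-reduction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `max_len` cap on span length, and the hand-picked `predicted_starts` are all assumptions made for illustration, and the attention-based decoder that assigns entity types is left out. The sketch only shows why enumerating spans from predicted start positions yields fewer candidates than exhaustive enumeration.

```python
from typing import Iterable, List, Tuple

Span = Tuple[int, int]  # (start, end), both token indices, end inclusive


def exhaustive_spans(n_tokens: int, max_len: int) -> List[Span]:
    """Enumerate every span of at most max_len tokens (plain span classification)."""
    return [
        (s, e)
        for s in range(n_tokens)
        for e in range(s, min(s + max_len, n_tokens))
    ]


def spans_from_starts(starts: Iterable[int], n_tokens: int, max_len: int) -> List[Span]:
    """Enumerate spans only from positions predicted to begin an entity."""
    return [
        (s, e)
        for s in sorted(set(starts))
        for e in range(s, min(s + max_len, n_tokens))
    ]


tokens = ["The", "president", "of", "the", "United", "States", "spoke"]
# Hypothetical output of a start-position classifier (assumed, not from the paper).
predicted_starts = [0, 3, 4]

all_spans = exhaustive_spans(len(tokens), max_len=4)
candidates = spans_from_starts(predicted_starts, len(tokens), max_len=4)
print(len(all_spans), len(candidates))
```

In a full system, the candidates grouped under each start position would then be passed to the decoder in sequence, so that earlier predictions can inform later ones.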

NIAN Yongming, CHEN Yanping, QIN Yongbin, HUANG Ruizhang. Nested Named Entity Recognition Method Based on Span Decoding[J]. Computer Engineering and Applications, 2024, 60(1): 174-181.
