Chinese Medical Named Entity Recognition Based on Multi-Layer Dynamic Fusion

doi:10.3778/j.issn.1002-8331.2305-0371

Abstract

Named entity recognition methods built on pre-trained models typically use only the hidden state of the model's last layer, ignoring the fact that different Transformer layers capture different kinds of textual information. To address this, a multi-layer dynamic fusion method for pre-trained models is proposed. The pre-trained model performs feature extraction, yielding a hidden state sequence for each of its layers. The multi-layer dynamic fusion method then combines the hidden states of all layers, and the result serves as the final output of the pre-trained model. A conditional random field decodes the sequence to complete sequence labeling. Multi-layer dynamic fusion makes full use of the knowledge in each layer of the pre-trained model, so that the output contains rich syntactic, semantic, and other feature information, which improves the model's representation ability on the task and increases its flexibility. Experiments on the medical text datasets CMeEE and CCKS2017 and on the general-domain datasets Resume and Weibo show that adding multi-layer dynamic fusion effectively improves named entity recognition performance.

Key words: medical text mining, named entity recognition, pre-trained language model, multi-layer dynamic fusion
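
The abstract does not state the fusion formula, so the following is only a minimal sketch of one common way to combine per-layer hidden states: a softmax-weighted sum with one scalar score per layer. The function names, the toy hidden states, and the `layer_scores` parameters are all hypothetical and are not taken from the paper; in a real model the scores would be trained jointly with the tagger, and the fused vectors would be passed on to the conditional random field.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_layers(layer_states, layer_scores):
    """Combine per-layer hidden states into one sequence of token vectors.

    layer_states: list over layers; each layer is a list over tokens;
                  each token is a list of floats (its hidden vector).
    layer_scores: one scalar per layer (hypothetical parameters; assumed
                  here to be learned together with the rest of the model).
    """
    weights = softmax(layer_scores)
    num_tokens = len(layer_states[0])
    hidden_size = len(layer_states[0][0])
    fused = [[0.0] * hidden_size for _ in range(num_tokens)]
    for weight, layer in zip(weights, layer_states):
        for t, token_vec in enumerate(layer):
            for d, value in enumerate(token_vec):
                fused[t][d] += weight * value
    return fused, weights


# Toy input (made up for illustration): 3 layers, 2 tokens, hidden size 2.
layer_states = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 2.0], [2.0, 2.0]],
    [[0.0, 4.0], [4.0, 0.0]],
]
layer_scores = [0.0, 0.0, 0.0]  # equal scores give a plain average of the layers
fused, weights = fuse_layers(layer_states, layer_scores)
```

With equal scores the fusion reduces to averaging the layers. Raising the score of one layer shifts the weight toward it, which is how such a scheme could let the model emphasize whichever layers are most useful for the task instead of relying on the last layer alone.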

LIN Lingde, LIU Na, XU Zhenshun, LI Ang, LI Chen. Chinese Medical Named Entity Recognition Based on Multi-Layer Dynamic Fusion[J]. Computer Engineering and Applications, 2024, 60(15): 161-169.

