Named Entity Recognition in Chinese Electronic Medical Records Using Transformer-CRF

doi:10.3778/j.issn.1002-8331.1909-0211

Abstract

Named entity recognition (NER) is one of the basic tasks of natural language processing. To address the poor performance of traditional models on named entity recognition in Chinese electronic medical records (EMR), a neural network model based entirely on the attention mechanism is proposed. First, a self-built data set of real Chinese electronic medical records is preprocessed by manual labeling and word segmentation. Second, a Transformer model is trained and optimized to extract text features. Finally, conditional random fields (CRF) are used to classify and recognize the extracted text features. To verify the effectiveness of the proposed method, the Transformer-CRF model is compared with seven traditional models, and recognition performance is evaluated by three indicators: precision, recall, and F1 value. The experimental results show that, on the same corpus, the Transformer-CRF model recognizes named entities of the body-part category well, with an F1 value as high as 95.02%. Its precision, recall, and F1 value are also higher than those of the other seven traditional models, which supports, to some extent, the claim that the model has better recognition performance.
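The three indicators named in the abstract can be illustrated with a minimal sketch. It assumes BIO tagging and exact-match scoring at the entity level; the abstract does not state the tagging scheme or the matching rule, and the tag names `BODY` and `SYM` and both function names are hypothetical.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive); an assumed representation.
Entity = Tuple[str, int, int]


def extract_entities(tags: List[str]) -> Set[Entity]:
    """Collect (type, start, end) spans from a BIO tag sequence."""
    entities: Set[Entity] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            entities.add((etype, start, i))  # current span ends before position i
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]  # a new span begins here
    return entities


def precision_recall_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level precision, recall and F1 under exact span and type match."""
    gold_set = extract_entities(gold)
    pred_set = extract_entities(pred)
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels: BODY = body part, SYM = symptom.
gold = ["B-BODY", "I-BODY", "O", "B-SYM", "I-SYM"]
pred = ["B-BODY", "I-BODY", "O", "B-SYM", "O"]
print(precision_recall_f1(gold, pred))
```

In this toy case the predicted symptom span is one token too short, so only the body-part entity counts as correct under exact matching.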

Key words: electronic medical records (EMR), named entity recognition, Transformer, conditional random fields (CRF)
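The final step described in the abstract, in which a CRF layer labels the features extracted by the Transformer, is commonly decoded with the Viterbi algorithm. The sketch below is not the authors' implementation: it assumes a linear-chain CRF without start or end transition scores, and the emission and transition values are hand-picked for illustration, whereas in a trained model they would come from the encoder and the learned CRF parameters.

```python
from typing import List


def viterbi_decode(emissions: List[List[float]],
                   transitions: List[List[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence.

    emissions[t][j]   : score of tag j at position t (from the encoder).
    transitions[i][j] : score of moving from tag i to tag j.
    """
    num_tags = len(transitions)
    score = list(emissions[0])  # best score of any path ending in each tag
    backptrs: List[List[int]] = []
    for emit in emissions[1:]:
        new_score, ptrs = [], []
        for j in range(num_tags):
            # Best previous tag for reaching tag j at this position.
            best_i = max(range(num_tags), key=lambda i: score[i] + transitions[i][j])
            ptrs.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
        score = new_score
        backptrs.append(ptrs)
    # Follow the back-pointers from the best final tag.
    path = [max(range(num_tags), key=lambda j: score[j])]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    return path[::-1]


# Hypothetical tag set: 0 = O, 1 = B-BODY, 2 = I-BODY.
emissions = [[2.0, 1.5, 0.0],
             [0.0, 0.2, 1.0],
             [1.0, 0.0, 0.0]]
transitions = [[0.0, 0.0, -10.0],  # O -> I-BODY is heavily penalised
               [0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]]
print(viterbi_decode(emissions, transitions))  # [1, 2, 0]
```

Choosing the best tag at each position independently would give O, I-BODY, O here, which is not a valid BIO sequence. The transition penalty steers decoding to B-BODY, I-BODY, O instead, which is the kind of label-dependency constraint a CRF layer adds on top of per-token features.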


LI Bo, KANG Xiaodong, ZHANG Huali, WANG Yage, CHEN Yayuan, BAI Fang. Named Entity Recognition in Chinese Electronic Medical Records Using Transformer-CRF[J]. Computer Engineering and Applications, 2020, 56(5): 153-159.
