Computer Engineering and Applications, 2023, Vol. 59, Issue (17): 35-47. DOI: 10.3778/j.issn.1002-8331.2212-0285
YI Junhui, ZHA Qinglin
Online: 2023-09-01
Published: 2023-09-01
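Abstract: As medical informatization advances in China and electronic medical records become widespread, the volume of medical data is growing explosively. Traditional Chinese medicine (TCM) electronic medical records are the structured storage form of TCM medical cases and contain a large amount of TCM clinical experience. Symptoms are the main basis on which physicians diagnose disease and differentiate syndromes. Extracting symptom information from TCM electronic medical records yields the key information in the text, which helps improve the efficiency of TCM diagnosis and treatment and supports subsequent research on disease-symptom and drug-symptom relationships. This paper briefly introduces the TCM symptom information extraction process. It then describes, separately for TCM symptom named entity recognition and for relation extraction, the difficulties, the evaluation criteria, and the research results of recent years, and compares the methods used in these studies. Finally, it summarizes the downstream applications of symptom information extraction and offers approaches to the open problems in the symptom information extraction task.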
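The abstract mentions evaluation criteria for symptom named entity recognition without naming them. A minimal sketch follows, assuming the commonly used entity-level precision, recall and F1 over BIO-tagged characters. The tag name `SYM`, the function names, and the example sentence are illustrative assumptions, not taken from the paper.

```python
from typing import List, Set, Tuple

# (start, end_exclusive, entity_type); hypothetical representation for this sketch
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Decode a BIO tag sequence into a set of typed entity spans."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing entity
        # The current tag continues the open entity only if it is I- of the same type.
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def entity_prf(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level precision, recall and F1: a span counts only on exact match."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    tp = len(gold_spans & pred_spans)
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative sentence "头痛伴恶心" (headache with nausea), tagged per character.
gold = ["B-SYM", "I-SYM", "O", "B-SYM", "I-SYM"]
pred = ["B-SYM", "I-SYM", "O", "B-SYM", "O"]  # second symptom cut short
print(entity_prf(gold, pred))
```

Exact-match scoring is strict: the truncated second symptom counts as both a false positive and a false negative, so a boundary error costs precision and recall together.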
YI Junhui, ZHA Qinglin. Survey of TCM Symptom Information Extraction[J]. Computer Engineering and Applications, 2023, 59(17): 35-47.