Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (16): 63-73. DOI: 10.3778/j.issn.1002-8331.2209-0366
WANG Chen, LI Ming, MA Jingang
Online: 2023-08-15
Published: 2023-08-15
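Abstract: Applying information extraction to electronic medical records has yielded a rich body of research and made unstructured biomedical data usable. Relation extraction is an important subtask of information extraction and serves as the bridge from data to knowledge. This paper classifies relation extraction in detail according to the different problems it faces and the corresponding solutions. It collates the relevant evaluation tasks and representative datasets for relation extraction from electronic medical records. It then reviews, stage by stage, the progress in applying relation extraction to electronic medical record text, focusing on the widespread use of deep learning methods and on current progress with pre-trained models on this task. Finally, it looks ahead for the field, identifying unresolved problems and future research directions.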
WANG Chen, LI Ming, MA Jingang. Review of Relation Extraction in Electronic Medical Records[J]. Computer Engineering and Applications, 2023, 59(16): 63-73.