Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (1): 1-11.DOI: 10.3778/j.issn.1002-8331.2107-0359

• Research Hotspots and Reviews •

Survey of Overlapping Entities and Relations Extraction

FENG Jun, ZHANG Tao, HANG Tingting   

  1. Key Laboratory of Water Big Data Technology of Ministry of Water Resources, School of Computer and Information, Hohai University, Nanjing 211100, China
  • Online: 2022-01-01 Published: 2022-01-06




Abstract: Entity relation extraction extracts factual knowledge from text and is an important task in the field of natural language processing. Traditional relation extraction focuses on the relation of a single entity pair; in practice, however, a sentence often contains more than one entity pair, and the entities of different triples may overlap. Research on overlapping entity relation extraction is therefore of great value. Since the task emerged, methods have broadly fallen into three classes: sequence-to-sequence (Seq2Seq)-based, graph-based, and pre-trained language model-based. Seq2Seq-based methods mainly rely on tagging strategies and copy mechanisms, graph-based methods are built on static or dynamic graphs, and pre-trained language model-based methods mainly mine latent semantic features with BERT. This paper reviews the development of the task, discusses and analyzes the strengths and weaknesses of each model, and, combining recent research developments, offers an outlook on future research directions.
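The overlap phenomenon the abstract describes can be illustrated with a small sketch. The sentence and triples below are hypothetical examples (not from the paper); the case where one entity participates in multiple triples is commonly called SingleEntityOverlap (SEO) in this literature:

```python
# Illustrative (hypothetical) sentence and its relational triples.
sentence = "Donald Trump was born in New York City, United States."
triples = [
    ("Donald Trump", "born_in", "New York City"),
    ("New York City", "located_in", "United States"),
]

# "New York City" appears in two triples, so the triples share an entity:
# this is the overlapping case that single-pair extractors cannot model.
entities = [e for (head, _, tail) in triples for e in (head, tail)]
shared = {e for e in entities if entities.count(e) > 1}
print(shared)  # {'New York City'}
```

When two triples share *both* entities (e.g. the same pair linked by two different relations), the literature calls it EntityPairOverlap (EPO); handling both cases is what distinguishes overlapping extraction from the traditional single-pair setting.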

Key words: overlapping entities and relations, deep learning, graph neural network, pre-training language model

