Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (9): 182-189. DOI: 10.3778/j.issn.1002-8331.2201-0418

• Pattern Recognition and Artificial Intelligence •

End-to-End Triple Extraction Incorporated Formal Concept of Relation

CHENG Chunlei, ZOU Jing, YE Qing, ZHANG Suhua, LAN Yong, YANG Rui

  1. School of Computer, Jiangxi University of Chinese Medicine, Nanchang 330004, China
  • Online: 2023-05-01 Published: 2023-05-01

Abstract: Triple extraction is foundational to knowledge learning and knowledge graph construction. Current models for this task often suffer from weak semantic association between entity recognition and relation extraction, from entity nesting and relation overlap, and from paying little attention to existing concept knowledge. To address these problems, formal concepts are combined with a neural network model, and an end-to-end triple extraction method based on the formal concept of relation is proposed. Formal concept labels of relations are introduced to unify the semantic expression of entities and relations, which converts entity recognition into a concept label learning problem. The entities are then fed into a relational formal concept attention model, whose attention mechanism aims to capture the connected connotation features of the subject and object concepts of a relation; that is, for each relation label, training yields the combined features of the corresponding subject and object and of the predicates they depend on in context. Multiple relation classifiers output multiple relation labels for each subject-object pair, which achieves multi-relation extraction based on concept connectivity. In addition, the model can introduce the extension and connotation of existing formal concepts, to reduce its dependence on corpus labels and to ease the labeling difficulties caused by entity nesting. Experiments on two datasets evaluate the model with three metrics. The results show that the proposed model is practical and effective for knowledge extraction and can alleviate the entity nesting and relation overlap problems.

Key words: triple extraction, formal concept, attention, relation overlap
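
The abstract states that multiple relation classifiers output multiple relation labels for each subject-object pair. The following is a minimal sketch of that general idea only, not the authors' model: it assumes one independent binary classifier per relation, each applying a sigmoid to a linear score over a feature vector for the pair. The relation names, weights, feature values, and the function `predict_relations` are all hypothetical, and the concept labels and attention mechanism described in the abstract are omitted.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_relations(pair_features, classifiers, threshold=0.5):
    """Return every relation whose independent binary classifier fires.

    Each relation is scored separately, so a single subject-object pair
    can receive zero, one, or several relation labels.
    """
    predicted = []
    for relation, (weights, bias) in classifiers.items():
        score = sum(w * f for w, f in zip(weights, pair_features)) + bias
        if sigmoid(score) >= threshold:
            predicted.append(relation)
    return sorted(predicted)


# Hypothetical hand-picked weights, for illustration only (not learned).
classifiers = {
    "born_in": ([2.0, -1.0, 0.0], -0.5),
    "lives_in": ([1.5, 0.0, 1.0], -1.0),
    "works_for": ([-2.0, 1.0, 0.5], -0.5),
}

# Hypothetical feature vector for one subject-object pair.
pair_features = [1.0, 0.0, 0.5]

relations = predict_relations(pair_features, classifiers)
print(relations)
```

Because each relation has its own classifier and threshold, the labels do not compete as they would under a single softmax, which is one common way to let the same entity pair carry overlapping relations.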