Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (1): 165-174. DOI: 10.3778/j.issn.1002-8331.2007-0219

• Pattern Recognition and Artificial Intelligence •

Short Text Entity Link Based on Domain Knowledge Graph

HUANG Jinjie, ZHAO Xuanwei, ZHANG Xinyao, MA Jingping, SHI Yuqi   

  1. School of Automation, Harbin University of Science and Technology, Harbin 150080, China
  • Online: 2022-01-01 Published: 2022-01-06

基于领域知识图谱的短文本实体链接

黄金杰,赵轩伟,张昕尧,马敬评,史宇奇   

  1. 哈尔滨理工大学 自动化学院,哈尔滨 150080

Abstract: Entity linking identifies potential entity mentions in text and links each one to an unambiguous entity in a given knowledge base. Chinese short texts often lack effective contextual information, which makes polysemous mentions ambiguous. In addition, the uncertain correlation among candidate entities reduces the accuracy of candidate linking. To address these two problems, an entity linking model that combines a deep neural network with an association graph is proposed. The model adds character features, context, and deep semantic information to enhance the representations of mentions and entities, and then performs similarity matching. The Fast-Newman algorithm is used to partition the graph knowledge base into entity clusters of different types, and the cluster containing the candidate entity with the highest similarity score is mapped onto the relation plane to construct a clustered entity association graph. A biased random walk algorithm then examines the semantic relatedness between candidate entities, calculates the degree of matching between each mention and its candidate entities, and outputs the linked entity. The model can accurately link short text to the target entities in the knowledge graph.

Key words: entity linking, neural network, association graph, similarity calculation, semantic relatedness
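
The last two stages described in the abstract, similarity matching followed by a graph walk over candidate entities, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the form of the biased random walk, so the sketch assumes a random walk with restart whose restart distribution is biased toward candidates with high similarity scores and toward unambiguous entities found in the same text. The neural encoder and the Fast-Newman clustering step are omitted. All vectors, entity names, edge weights, and function names below are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def random_walk_with_restart(graph, restart, alpha=0.15, iters=100):
    """Iterate a walk that jumps back to `restart` with probability alpha.

    graph:   dict node -> dict neighbour -> edge weight (every neighbour
             must also be a key of `graph`)
    restart: dict node -> probability, summing to 1 over the nodes
    """
    nodes = list(graph)
    rank = {n: restart.get(n, 0.0) for n in nodes}
    for _ in range(iters):
        nxt = {n: alpha * restart.get(n, 0.0) for n in nodes}
        for n in nodes:
            total = sum(graph[n].values())
            if total == 0:
                # Dangling node: return its mass to the restart distribution.
                for m in nodes:
                    nxt[m] += (1 - alpha) * rank[n] * restart.get(m, 0.0)
                continue
            for m, w in graph[n].items():
                nxt[m] += (1 - alpha) * rank[n] * w / total
        rank = nxt
    return rank


# Hypothetical embeddings for the mention "apple" and two candidate entities.
mention_vec = [1.0, 1.0]
candidate_vecs = {"Apple_Inc": [1.0, 0.8], "Apple_fruit": [0.8, 1.0]}
similarity = {c: cosine(mention_vec, v) for c, v in candidate_vecs.items()}

# Hypothetical association graph around the candidates (undirected).
graph = {
    "Apple_Inc": {"iPhone": 1.0, "Tim_Cook": 1.0},
    "Apple_fruit": {"Pear": 1.0},
    "iPhone": {"Apple_Inc": 1.0},
    "Tim_Cook": {"Apple_Inc": 1.0},
    "Pear": {"Apple_fruit": 1.0},
}

# Bias the restart toward similar candidates and toward "iPhone", which is
# assumed to be an unambiguous entity mentioned in the same short text.
bias = dict(similarity)
bias["iPhone"] = 1.0
norm = sum(bias.values())
restart = {n: w / norm for n, w in bias.items()}

rank = random_walk_with_restart(graph, restart)
best = max(candidate_vecs, key=lambda c: rank[c])
print(best, {c: round(rank[c], 3) for c in candidate_vecs})
```

In this toy setup the two candidates have the same similarity to the mention, so similarity matching alone cannot separate them. The walk favors the candidate that is connected to the context entity, which is the kind of semantic relatedness signal the abstract attributes to the association graph.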
