Computer Engineering and Applications ›› 2017, Vol. 53 ›› Issue (15): 155-163. DOI: 10.3778/j.issn.1002-8331.1610-0332

Research on improved algorithm for collaborative prediction of heterogeneous links

SHENG Quanwei1, WANG Yibai1, GAO Yang2   

  1. The Information Engineering College, Changsha Medical University, Changsha 410219, China
  2. Department of Computer Science and Technology, Nanjing University, Nanjing 210046, China
  • Online:2017-08-01 Published:2017-08-14


盛权为1,汪一百1,高  阳2   

  1. 长沙医学院 信息工程学院,长沙 410219
  2. 南京大学 计算机科学与技术系,南京 210046

Abstract: Existing link prediction methods either predict a single link type or predict multiple link types independently of one another, which often makes the results insufficiently accurate. To address this, the paper studies the collective prediction of multiple link types in heterogeneous information networks. First, the homophily connection principle is introduced: based on the current links between nodes similar to the source node and nodes similar to the target node, a relatedness index for nodes of different types is designed. The index describes the probability that a link exists between nodes of different types, and it is combined with traditional proximity indices to extend them to heterogeneous link prediction. Then, labeled and unlabeled data in the heterogeneous information network are fused, and a Heterogeneous Collective Link Prediction (HCLP) algorithm is proposed, which predicts multiple link types collectively by capturing the complex relationships among different link types and combining their complementary prediction information. Experiments on real-world scenarios show that the proposed collective link prediction approach effectively improves link prediction performance in heterogeneous information networks.

Key words: heterogeneous information networks, relatedness index, homophily connection, link existence probability, collaborative prediction, proximity index

摘要: 现有的链路预测方法仅考虑单种链路类型预测或多种链路类型的独立预测,经常使得预测结果不够准确。为此,研究了异构信息网络中多种链路类型的协同预测问题。根据源节点的相似节点和目标节点的相似节点之间的当前链路信息,提出了同质连接原理,设计了一种针对不同类型节点的相关性指标,用于描述不同类型节点间的链路存在概率,并将其与传统的邻近性指标相结合拓展到异构链路预测中。然后,将异构信息网络中的被标记数据和无标记数据融合起来,提出一种异构链路协同预测算法(Heterogeneous Collective Link Prediction, HCLP),通过获得不同类型链路间的各种复杂关系,结合互补性预测信息,实现多种链路类型的协同预测。基于真实场景的实验结果表明,所提的链路协同预测方法可有效提升异构信息网络的链路预测性能。

关键词: 异构信息网络, 相关性指标, 同质连接, 链路存在概率, 协同预测, 邻近性指标
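
The abstract describes the relatedness index only in words, so the following is a minimal illustrative sketch of the homophily idea rather than the paper's formula, which is not given here. It assumes that "similar" nodes are same-type nodes whose neighbor sets have a Jaccard similarity above a threshold, and that the score is the fraction of (similar-source, similar-target) pairs that are already linked. The function names, the threshold, and the toy data are all hypothetical.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_nodes(node, neighbors, threshold=0.5):
    """Same-type nodes whose neighbor sets resemble `node`'s (assumed rule)."""
    return {
        other
        for other in neighbors
        if other != node
        and jaccard(neighbors[node], neighbors[other]) >= threshold
    }


def relatedness(source, target, src_neighbors, tgt_neighbors, links,
                threshold=0.5):
    """Fraction of (similar source, similar target) pairs already linked.

    Assumed stand-in for a relatedness index: a high value means nodes
    like `source` tend to link to nodes like `target`.
    """
    sim_src = similar_nodes(source, src_neighbors, threshold)
    sim_tgt = similar_nodes(target, tgt_neighbors, threshold)
    pairs = list(product(sim_src, sim_tgt))
    if not pairs:
        return 0.0
    linked = sum(1 for pair in pairs if pair in links)
    return linked / len(pairs)


# Toy heterogeneous network: users (described by friends) and items
# (described by tags); `links` holds existing user-item links.
user_friends = {
    "u1": {"a", "b"},
    "u2": {"a", "b"},
    "u3": {"a", "b", "c"},
    "u4": {"d"},
}
item_tags = {
    "i1": {"rock", "live"},
    "i2": {"rock", "live"},
    "i3": {"jazz"},
}
links = {("u2", "i2"), ("u4", "i3")}

score = relatedness("u1", "i1", user_friends, item_tags, links)
print(score)  # 0.5: of the pairs (u2, i2) and (u3, i2), only (u2, i2) is linked
```

In this toy case the score for the unobserved pair (u1, i1) comes entirely from links among their look-alikes. Such a score could then be used as one feature alongside conventional proximity indices, which is the combination the abstract mentions.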