Computer Engineering and Applications ›› 2014, Vol. 50 ›› Issue (1): 121-126.

• Databases, Data Mining, and Machine Learning •

Prestige-based information ranking model in heterogeneous information networks

CHEN Yunzhi (陈云志)

  1. Hangzhou Vocational & Technical College, Hangzhou 310018, China
  • Publication date: 2014-01-01    Release date: 2013-12-30

Abstract: As the Internet continues to grow, the volume of data is increasing at a striking rate, redundant information is abundant, and data items are linked by complex relationships. Existing ranking methods therefore face a serious problem: redundant information degrades the ranking results. This paper focuses on obtaining authoritative, diverse, and understandable ranking results on heterogeneous information networks, and proposes a multi-objective ranking model that accounts for prestige and diversity simultaneously. The model first represents the data as a heterogeneous information network. It then uses MutualRank to model the prestige of objects through a random walk performed directly on that network, and uses PDRank to combine the learned prestige of each object with the diversity among objects, producing a ranked list that has both properties. Because the model draws on both the homogeneous and the heterogeneous relationships between objects to model prestige, it reduces redundant information in the ranking results. Experiments show that MutualRank models prestige better than the traditional PageRank, and that the rankings produced by the two-phase model are superior to those of existing baseline methods.

Key words: heterogeneous information network, MutualRank algorithm, prestige ranking, PDRank algorithm, ranking model
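The abstract names PageRank as the baseline for prestige and describes a second phase that balances prestige against diversity, but it does not specify the MutualRank or PDRank update rules. The sketch below therefore does not reproduce either algorithm. It only illustrates the two-phase idea with stand-ins: a standard power-iteration PageRank for the prestige phase, and a greedy re-ranking in the style of maximal marginal relevance for the diversity phase. The function names `pagerank` and `greedy_rerank`, the parameters `damping` and `trade_off`, and the `similarity` callback are all assumptions made for illustration.

```python
def pagerank(out_links, damping=0.85, iters=100):
    """Power-iteration PageRank on a dict {node: [successor, ...]}.

    Stand-in for the prestige phase; this is the PageRank baseline,
    not MutualRank.
    """
    nodes = sorted(set(out_links) | {v for vs in out_links.values() for v in vs})
    n = len(nodes)
    score = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        # Score held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(score[u] for u in nodes if not out_links.get(u))
        nxt = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            targets = out_links.get(u, [])
            for v in targets:
                nxt[v] += damping * score[u] / len(targets)
        score = nxt
    return score


def greedy_rerank(prestige, similarity, k, trade_off=0.7):
    """Greedy prestige/diversity selection (MMR-style; hypothetical, not PDRank).

    At each step, pick the candidate with the best weighted difference
    between its prestige and its maximum similarity to items already chosen.
    """
    selected = []
    candidates = set(prestige)
    while candidates and len(selected) < k:
        def gain(c):
            redundancy = max((similarity(c, s) for s in selected), default=0.0)
            return trade_off * prestige[c] - (1.0 - trade_off) * redundancy
        # Sorting makes tie-breaking deterministic.
        best = max(sorted(candidates), key=gain)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    # Toy homogeneous graph: three nodes all point to "d".
    graph = {"a": ["d"], "b": ["d"], "c": ["d"], "d": []}
    scores = pagerank(graph)
    print(max(scores, key=scores.get))

    # Toy re-ranking: "a" and "a_copy" are near-duplicates.
    prestige = {"a": 0.9, "a_copy": 0.85, "b": 0.5}
    def sim(x, y):
        return 1.0 if {x, y} == {"a", "a_copy"} else 0.0
    print(greedy_rerank(prestige, sim, k=2, trade_off=0.5))
```

In the toy re-ranking, the near-duplicate item is skipped in favor of a less prestigious but non-redundant one, which is the kind of redundancy reduction the abstract describes. How the paper actually measures diversity, and how its random walk handles multiple object and link types, cannot be inferred from the abstract alone.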