Computer Engineering and Applications, 2020, Vol. 56, Issue (21): 131-138. DOI: 10.3778/j.issn.1002-8331.1908-0163

• Pattern Recognition and Artificial Intelligence •

Community Detecting Method Based on Non-negative Matrix Factorization with Graph Regular Term in Heterogeneous Information Networks

LIU Jiaji, BAO Chongming, ZHOU Lihua, WANG Chongyun, KONG Bing

  1. School of Information Science and Engineering, Yunnan University, Kunming 650091, China
  2. School of Software, Yunnan University, Kunming 650091, China
  3. Institute of Ecology and Geobotany, Yunnan University, Kunming 650091, China
  • Online: 2020-11-01 Published: 2020-11-03

Abstract:

Mining valuable and stable communities in data networks is important for acquiring and recommending network information and for predicting how networks evolve. Existing heterogeneous network clustering methods have difficulty integrating the heterogeneous information in a network within a single dimension. To address this problem, this paper proposes a heterogeneous network clustering method based on graph-regularized non-negative matrix factorization. Graph regularization terms introduce the internal connection relationships of the central-type subspace and the attribute-type subspaces into the non-negative matrix factorization model as constraints, so that a compact low-dimensional embedding of the high-dimensional data is found and part of the noise between heterogeneous nodes is eliminated. At the same time, the method optimizes a consensus matrix that reflects the latent structure shared by the different sub-networks, which integrates the heterogeneous information effectively and largely preserves its integrity during dimensionality reduction. This improves the accuracy of heterogeneous network clustering, and experimental results on real-world datasets verify the effectiveness of the method.

Key words: heterogeneous network, community detecting, non-negative matrix factorization, graph regular term
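
The abstract does not state the objective function, so the following is only a minimal sketch of generic single-view graph-regularized NMF with multiplicative updates. It is not the authors' multi-subspace model and it has no consensus matrix. It assumes the objective ||X - U V^T||_F^2 + lam * tr(V^T L V) with graph Laplacian L = D - W. The function name `gnmf`, the weight `lam`, and the toy matrices are hypothetical and chosen for illustration.

```python
import numpy as np

def gnmf(X, W, k, lam=1.0, n_iter=200, eps=1e-10, seed=0):
    """Factorize non-negative X (m x n) as U @ V.T with a graph penalty on V.

    W is a symmetric, non-negative n x n affinity matrix over the columns of X.
    Assumed objective: ||X - U V^T||_F^2 + lam * tr(V^T L V), with L = D - W.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k))          # basis matrix, m x k
    V = rng.random((n, k))          # low-dimensional embedding, n x k
    D = np.diag(W.sum(axis=1))      # degree matrix of the graph
    L = D - W                       # graph Laplacian
    history = []
    for _ in range(n_iter):
        # Multiplicative updates keep U and V non-negative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
        loss = np.linalg.norm(X - U @ V.T) ** 2 + lam * np.trace(V.T @ L @ V)
        history.append(loss)
    return U, V, history

# Hypothetical toy data: 4 attribute nodes (rows) linked to 6 central nodes (columns).
X = np.array([[1, 1, 1, 0, 0, 0],
              [1, 1, 0, 0, 0, 0],
              [0, 0, 0, 1, 1, 1],
              [0, 0, 1, 0, 1, 1]], dtype=float)

# Hypothetical links among the 6 central nodes (two triangles).
W = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    W[i, j] = W[j, i] = 1.0

U, V, history = gnmf(X, W, k=2, lam=0.5)
labels = V.argmax(axis=1)  # one community index per central node
```

Here each row of `V` is the embedding of one central node, and taking the largest coordinate is one simple way to read off a community assignment. The graph term pulls linked nodes toward similar embeddings, which is the role the abstract assigns to the regularization constraint.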