Computer Engineering and Applications ›› 2013, Vol. 49 ›› Issue (3): 213-221.

Survey on spectral clustering and its applications in social networks

MENG Qinxue, Paul J. Kennedy   

  1. Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia
  • Online: 2013-02-01 Published: 2013-02-18

The current state of spectral clustering and its applications in social networks

孟钦学,Paul J. Kennedy   

  1. Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, New South Wales 2007

Abstract: In recent years, analyzing social networks with data clustering has become one of the most popular research topics because of its practical significance. A direct benefit of this research is that detecting communities can help prevent terrorist attacks and track the spread of diseases. Moreover, because social networks are dynamic and data clustering can predict changes in social ties, a clear understanding of their structure can help promote social development and cooperation. From a data mining perspective, social networks are incomplete, huge, complex and dynamic, and these features make traditional data clustering methods work poorly on them. In contrast, spectral clustering, one of the most popular modern data clustering algorithms, offers a systematic, flexible and practical solution to social network problems. Theory and experiments show that it outperforms traditional data clustering algorithms in finding global solutions and in processing large datasets. This paper reviews the current theory and methodology of spectral clustering and its advantages over traditional data clustering algorithms. It also covers some fundamentals of social networks and two typical applications of spectral clustering in social networks.

Key words: spectral clustering, social networks, similarity matrix, Laplacian matrix

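Abstract (Chinese version): In recent years, because of its research significance, analyzing social networks with data clustering has become one of the most popular topics. The most direct applications of this research are preventing terrorist attacks and detecting the spread of diseases through community detection. In addition, because social networks are dynamic and changes in social relationships can be predicted by data clustering methods, a clear understanding of social network structure helps promote social development and cooperation among members of society. From a data mining perspective, social networks are incomplete, huge, complex and dynamic networks, and these characteristics prevent traditional data clustering methods from being applied to them successfully. In contrast, spectral clustering, one of the most popular modern data clustering algorithms, provides a systematic, flexible and practical solution to social network problems. Theory and experiments show that spectral clustering outperforms traditional clustering algorithms in finding globally optimal solutions and in processing large datasets. This paper reviews the current theory and algorithms of spectral clustering and the features that give it an advantage over traditional clustering algorithms. It also covers basic knowledge of social networks and two typical applications of spectral clustering in social networks.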

Key words (Chinese version): spectral clustering, social networks, similarity matrix, Laplacian matrix