Computer Engineering and Applications (计算机工程与应用), 2016, Vol. 52, Issue 20: 69-74.

• Theory and R&D •

Community-detection algorithm based on topic division and link division

OUYANG Ji¹, ZHOU Xianzheng², ZHUO Xiaoyan², HUANG Han²

  1. School of Computer, Dongguan University of Technology, Dongguan, Guangdong 523808, China
  2. School of Software Engineering, South China University of Technology, Guangzhou 510006, China
  • Online: 2016-10-15 Published: 2016-10-14

Abstract: Traditional community detection algorithms partition communities by link relationships alone, which makes it hard to discover non-link relationships between communities and therefore reduces partition accuracy. This paper analyzes the text information attached to nodes, extracts the topic information hidden in that text, and uses it to determine how communities relate to one another by topic. An optimized latent Dirichlet allocation model is designed to divide communities by topic, and an optimized modularity-based community detection algorithm is applied to divide communities by link. The two are combined into a topic community detection algorithm that performs both topic division and link division. In addition, an evaluation method for topic communities is designed, and the algorithm is validated experimentally on multiple datasets at each stage of topic community detection. The experimental results show that the proposed algorithm divides communities correctly by both topic and link.

Key words: social computing, community detection, topic model, topic division
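
The two-stage idea in the abstract (topic division followed by link division) can be illustrated with a minimal Python sketch. It is not the paper's method: the optimized latent Dirichlet allocation model is replaced by per-node topic weights that are assumed to be already available, and the optimized modularity algorithm is replaced by a plain connected-components split inside each topic group. All names (`topic_division`, `link_division`, `detect_topic_communities`, `modularity`) and the toy data are hypothetical. The standard Newman modularity Q is used here only to score the resulting partition.

```python
from collections import defaultdict


def topic_division(node_topics):
    """Group nodes by dominant topic (ties go to the lowest topic index)."""
    groups = defaultdict(set)
    for node, weights in node_topics.items():
        best = max(range(len(weights)), key=lambda t: weights[t])
        groups[best].add(node)
    return dict(groups)


def link_division(nodes, edges):
    """Split one topic group into connected components.

    Assumption: this is a simple stand-in for modularity optimization.
    Edges that leave the group are ignored.
    """
    adjacency = {n: set() for n in nodes}
    for u, v in edges:
        if u in adjacency and v in adjacency:
            adjacency[u].add(v)
            adjacency[v].add(u)
    seen, parts = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            n = stack.pop()
            if n in component:
                continue
            component.add(n)
            stack.extend(adjacency[n] - component)
        seen |= component
        parts.append(component)
    return parts


def detect_topic_communities(node_topics, edges):
    """Topic division first, then link division inside each topic group."""
    groups = topic_division(node_topics)
    communities = []
    for topic in sorted(groups):
        communities.extend(link_division(groups[topic], edges))
    return communities


def modularity(edges, communities):
    """Newman modularity Q = sum_c (L_c / m - (d_c / 2m)^2).

    Assumes an undirected, unweighted edge list without duplicates.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    label = {n: i for i, c in enumerate(communities) for n in c}
    internal, degree = defaultdict(int), defaultdict(int)
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            internal[label[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Toy data (hypothetical): topic weights per node and an undirected edge list.
node_topics = {
    "a": [0.9, 0.1], "b": [0.8, 0.2], "c": [0.7, 0.3], "g": [0.7, 0.3],
    "d": [0.2, 0.8], "e": [0.1, 0.9], "f": [0.3, 0.7],
}
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("f", "g")]

communities = detect_topic_communities(node_topics, edges)
print(communities)
print(modularity(edges, communities))
```

In this toy graph, node `g` shares a topic with `a`, `b` and `c` but links only to `f`, so topic division groups it with them and link division then separates it again. This is the kind of case where using topic and link information together gives a different partition from using either one alone.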