Computer Engineering and Applications ›› 2015, Vol. 51 ›› Issue (6): 93-98.

Micro-blog topic detection method based on overlap community detection

CHENG Fei, JI Donghong   

  1. Computer School, Wuhan University, Wuhan 430072, China
  • Online: 2015-03-15 Published: 2015-03-13

基于重叠社团发现的微博话题检测方法

程  飞,姬东鸿   

  1. 武汉大学 计算机学院,武汉 430072

Abstract: Micro-blog topic detection has become a focus of current research. A micro-blog topic detection method based on overlapping community detection is proposed. Raw micro-blog data is first preprocessed. After the micro-blog content is segmented into words, topic words are extracted according to part of speech and temporal distribution. Edges are then created between highly relevant topic words to form a complex network. The concept of community independent modularity is introduced, and overlapping communities are detected with a community independent modularity maximization model; each community is taken as a micro-blog topic. Overlapping community detection addresses the low topic detection accuracy that arises when one or more keywords belong to more than one topic. Experimental results show that the proposed method is feasible for micro-blog topic detection.

Key words: micro-blog, topic detection, complex network, overlap community detection

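Abstract (translated from the Chinese 摘要): Micro-blog topic detection is a current research hotspot. A micro-blog topic detection method based on overlapping community discovery in complex networks is proposed. The method preprocesses micro-blog data collected over a period of time; after word segmentation, topic words are extracted according to part of speech and the temporal distribution of words, and edges are constructed between highly correlated topic words to obtain a complex network. The concept of community independent modularity is introduced, overlapping communities are discovered through a community independent modularity maximization model, and each community is regarded as a micro-blog topic. Overlapping community discovery can solve the problem of low topic detection accuracy caused by one or more topic words belonging to multiple topics. Experimental results demonstrate the effectiveness of the method in micro-blog topic detection.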

Key words (translated from the Chinese 关键词): micro-blog, topic detection, complex network, overlapping community discovery
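
The pipeline described in the abstract (topic words, then a word network, then overlapping communities read as topics) can be illustrated with a minimal sketch. It rests on several assumptions that are not from the paper: segmentation and topic-word extraction are taken as already done, the relevance between two words is approximated by the Jaccard overlap of the posts containing them, and maximal cliques found by the Bron-Kerbosch recursion stand in for the paper's community independent modularity maximization, whose definition is not given in the abstract. All function names and the sample data are hypothetical.

```python
from itertools import combinations


def build_word_network(posts, min_jaccard=0.5):
    """Link two topic words when the sets of posts containing them overlap strongly.

    Jaccard overlap is an assumed relevance measure, not the one used in the paper.
    """
    # Map each word to the set of post indices in which it appears.
    occurrences = {}
    for idx, words in enumerate(posts):
        for word in set(words):
            occurrences.setdefault(word, set()).add(idx)

    graph = {word: set() for word in occurrences}
    for a, b in combinations(sorted(occurrences), 2):
        shared = len(occurrences[a] & occurrences[b])
        union = len(occurrences[a] | occurrences[b])
        if shared / union >= min_jaccard:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)
            return
        for node in list(candidates):
            expand(current | {node}, candidates & graph[node], excluded & graph[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(graph), set())
    return cliques


def detect_topics(posts, min_jaccard=0.5, min_size=3):
    """Treat each sufficiently large maximal clique as one topic (a stand-in method)."""
    graph = build_word_network(posts, min_jaccard)
    return sorted(sorted(c) for c in maximal_cliques(graph) if len(c) >= min_size)


# Hypothetical topic words per post; "apple" is shared by two different topics.
SAMPLE_POSTS = [
    ["apple", "iphone", "launch"],
    ["apple", "iphone", "launch"],
    ["apple", "fruit", "price"],
    ["apple", "fruit", "price"],
]

topics = detect_topics(SAMPLE_POSTS)
print(topics)  # [['apple', 'fruit', 'price'], ['apple', 'iphone', 'launch']]
```

In this toy input the word "apple" ends up in both detected topics, which is the situation the abstract says a non-overlapping partition would handle poorly, since it would force the word into a single community.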