Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (5): 107-114. DOI: 10.3778/j.issn.1002-8331.1911-0452

• Pattern Recognition and Artificial Intelligence •

Topic Community Discovery Model Incorporating Topic Similarity Weight

QIAN Yunyun, YANG Wenzhong, YAO Miao, LI Hailei, CHAI Yachuang

  1. College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
  2. School of Software, Xinjiang University, Urumqi 830046, China
  • Online: 2021-03-01 Published: 2021-03-02

Abstract:

The structure of social networks is intricate, and topic communities are an important basis for personalized recommendation and commercial promotion. However, existing topic community mining methods either mine topic communities from link relationships and text information alone, or mine topics on top of communities that have already been partitioned. Both approaches ignore the interaction between topics and communities, which leads to low topic similarity within a community. This paper therefore proposes a new method for computing community topics and builds on it a topic community discovery model that incorporates topic similarity weights (TSWTCD). The model extracts topics from text information, computes the topic similarity between nodes as the link weight, and uses the link weights as parameters of the modularity function to partition communities. Finally, community topics are obtained with the proposed community topic computation method. Experimental results on real datasets show that the TSWTCD model improves the quality of the mined topic communities.

Key words: topic community, link information, topic similarity, modularity
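
The pipeline outlined in the abstract (topic similarity as link weight, weighted modularity for partitioning, then a topic per community) can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything specific here is an assumption for illustration only: per-node topic distributions are taken as already extracted from the text, cosine similarity stands in for the topic similarity measure, a simple greedy local-moving pass stands in for the partitioning step, and the community topic is taken as the dominant topic of the members' mean distribution. All function and variable names are hypothetical and are not the TSWTCD implementation.

```python
import math
from collections import defaultdict


def cosine(p, q):
    """Cosine similarity between two topic distributions (assumed measure)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def weighted_edges(edges, topics):
    """Attach the topic similarity of the two endpoints to every link."""
    return {(u, v): cosine(topics[u], topics[v]) for u, v in edges}


def modularity(weights, labels):
    """Weighted modularity: sum over communities of in/2m - (tot/2m)^2."""
    two_m = 2.0 * sum(weights.values())
    if two_m == 0:
        return 0.0
    internal = defaultdict(float)  # intra-community weight, each edge counted twice
    degree = defaultdict(float)    # total weighted degree per community
    for (u, v), w in weights.items():
        degree[labels[u]] += w
        degree[labels[v]] += w
        if labels[u] == labels[v]:
            internal[labels[u]] += 2.0 * w
    return sum(internal[c] / two_m - (degree[c] / two_m) ** 2 for c in degree)


def local_moving(nodes, weights, max_sweeps=20):
    """Greedy heuristic: move a node to a neighbouring community if Q increases."""
    labels = {n: n for n in nodes}  # start with every node in its own community
    neighbours = defaultdict(set)
    for u, v in weights:
        neighbours[u].add(v)
        neighbours[v].add(u)
    best = modularity(weights, labels)
    for _ in range(max_sweeps):
        moved = False
        for n in nodes:
            home = labels[n]
            for cand in sorted({labels[x] for x in neighbours[n]} - {home}):
                labels[n] = cand
                q = modularity(weights, labels)
                if q > best + 1e-12:
                    best, home, moved = q, cand, True
            labels[n] = home  # keep the best community found for this node
        if not moved:
            break
    return labels, best


def community_topics(labels, topics):
    """Dominant topic index of each community's mean distribution (assumed rule)."""
    members = defaultdict(list)
    for node, community in labels.items():
        members[community].append(topics[node])
    result = {}
    for community, vectors in members.items():
        mean = [sum(col) / len(vectors) for col in zip(*vectors)]
        result[community] = max(range(len(mean)), key=mean.__getitem__)
    return result


# Toy network: two triangles joined by one link; topic vectors are made up.
topics = {
    "a": [0.8, 0.1, 0.1], "b": [0.8, 0.1, 0.1], "c": [0.8, 0.1, 0.1],
    "d": [0.1, 0.8, 0.1], "e": [0.1, 0.8, 0.1], "f": [0.1, 0.8, 0.1],
}
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]

weights = weighted_edges(edges, topics)
labels, q = local_moving(list(topics), weights)
dominant = community_topics(labels, topics)
print(labels, round(q, 3), dominant)
```

In this toy input the link between c and d joins nodes with dissimilar topics, so it receives a low weight and contributes little to modularity, which is the intended effect of weighting links by topic similarity. A production setting would more likely rely on an established community detection routine that accepts edge weights rather than this quadratic-cost loop.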