Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (12): 94-99. DOI: 10.3778/j.issn.1002-8331.2203-0285

• Pattern Recognition and Artificial Intelligence •

Research Scholar Interest Mining Algorithm Incorporating Load Centrality

姜阳,薛哲,李昂   

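  1. Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Publication date: 2023-06-15 Release date: 2023-06-15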

Research Scholar Interest Mining Method Based on Load Centrality

JIANG Yang, XUE Zhe, LI Ang   

  1. Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Online:2023-06-15 Published:2023-06-15

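Abstract: In the era of big data, mining the interests of research scholars from data such as papers and patents plays an important role in building scholar profiles, in scholarly exchange and collaboration, and in analyzing research output. However, relatively little work has addressed interest mining for research scholars, and many problems remain to be solved. This paper proposes a load centrality based interest mining algorithm for research scholars (LCBIM). Working on scholars' paper and patent data, the algorithm accurately extracts keywords from their fields of interest. It uses the idea of graph aggregation to aggregate the feature space of each neighborhood and produce high-quality graph nodes, and it uses semantic analysis to merge vertices for similar words or redundant information, which simplifies the graph structure. It then computes the weight of each node in the graph according to the principle of load centrality and derives the scholar's fields of interest. The algorithm can mine scholars' interests from papers and patents, which carry rich semantic information. Experimental results show that the proposed algorithm extracts scholars' interests from paper and patent corpora quickly and effectively.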

Keywords: interest mining, data mining, load centrality, research scholar

Abstract: In the era of big data, mining the interests of research scholars from papers, patents and other data is important for constructing scholar profiles, supporting exchange and cooperation among scholars, and analyzing scientific research results. However, there is relatively little research on interest mining for research scholars, and many problems still need to be solved. This paper proposes a research scholar interest mining algorithm based on load centrality (load centrality based interest mining algorithm for research scholars, LCBIM), which accurately extracts the keywords of a scholar's fields of interest from the scholar's papers and patent data. Following the idea of graph aggregation, the feature space of each neighborhood is aggregated to generate high-quality graph nodes, and vertices representing similar words or redundant information are merged on the basis of semantic analysis to simplify the graph structure. The weights of the nodes in the graph are then calculated using the principle of load centrality, from which the scholar's fields of interest are obtained. The algorithm can mine scholars' points of interest from papers and patents that are rich in semantic information. Experimental results show that the proposed algorithm can quickly and effectively extract the interests of research scholars from paper and patent corpora.

Key words: interest mining, data mining, load centrality, research scholar
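
The node-weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' LCBIM implementation: the graph-aggregation and semantic vertex-merging steps are omitted, the keyword co-occurrence graph is an assumed stand-in for the graph the paper builds, and the names `build_cooccurrence_graph` and `load_centrality` are hypothetical. Load is computed here in its unnormalized form on an unweighted, undirected graph: every node sends one unit to every other node, and at each step the unit is split equally among the predecessors on shortest paths.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_cooccurrence_graph(documents):
    """Link keywords that appear in the same document (assumed graph model)."""
    graph = defaultdict(set)
    for keywords in documents:
        unique = sorted(set(keywords))
        for keyword in unique:
            graph[keyword]  # make sure isolated keywords become nodes too
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def load_centrality(graph):
    """Unnormalized load centrality for an unweighted, undirected graph."""
    load = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search: distances and shortest-path predecessors.
        dist = {source: 0}
        preds = defaultdict(list)
        order = []
        queue = deque([source])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in graph[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    preds[w].append(u)
        # Each reachable node carries one unit addressed to itself; pass the
        # flow back toward the source, split equally among predecessors.
        flow = {node: 1.0 for node in order}
        for v in reversed(order):
            if v == source:
                continue
            share = flow[v] / len(preds[v])
            for p in preds[v]:
                if p != source:
                    flow[p] += share
        # Count only the flow that passes through a node, not its own unit.
        for node in order:
            if node != source:
                load[node] += flow[node] - 1.0
    return load


# Toy corpus: each list holds the keywords of one paper or patent.
docs = [["graph", "centrality"], ["graph", "mining"], ["graph", "patent"]]
scores = load_centrality(build_cooccurrence_graph(docs))
for keyword, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(keyword, score)
```

In this toy corpus "graph" co-occurs with every other keyword, so all traffic between the remaining keywords passes through it and it receives the highest weight. Ranking keywords by such a weight is one plausible reading of how node weights translate into fields of interest.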