Computer Engineering and Applications, 2022, Vol. 58, Issue (22): 165-171. DOI: 10.3778/j.issn.1002-8331.2104-0143

• Pattern Recognition and Artificial Intelligence •

Research on an Academic Paper Recommendation Algorithm with Side Information Embedding

沈小烽,刘柏嵩,吴俊超,钱江波   

  1. College of Information Science and Engineering, Ningbo University, Ningbo, Zhejiang 315211
  • Publication date: 2022-11-15 Release date: 2022-11-15

Academic Paper Recommendation Based on Side Information Embedding

SHEN Xiaofeng, LIU Baisong, WU Junchao, QIAN Jiangbo   

  1. College of Information Science and Engineering, Ningbo University, Ningbo, Zhejiang 315211, China
  • Online: 2022-11-15 Published: 2022-11-15

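Abstract: To address data sparsity in paper recommendation, researchers usually introduce side (auxiliary) information about papers. However, most current studies focus on the semantic relevance of the side information and do not consider that different kinds of side information differ in their importance to a paper. Meanwhile, in the network representation of papers, random-walk methods ignore the influence of paper attributes on citation relationships. To address these two problems, a recommendation method based on citation side information embedding (CERec) is proposed. First, multiple quality factors of a paper are extracted to form an influence value, which serves as the paper's weight in constructing an influence network. Next, the paper's influence is combined with citation information, and graph embedding is performed using the paper's multiple kinds of side information. Finally, recommendations are obtained from the cosine similarity of the paper embedding vectors. Offline experiments show that methods that incorporate side information outperform those that do not, and that CERec improves recall and NDCG by 5.054% and 5.246% on average, respectively, over currently popular vector-representation recommendation algorithms.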

Keywords: paper recommendation, influence network, side information, graph embedding, cold start

Abstract: To address data sparsity in paper recommendation, researchers usually draw on side information about papers. Most current research, however, focuses on the semantic relevance of side information and does not consider that different kinds of side information differ in their importance to a paper. In addition, random-walk methods for paper network representation ignore the influence of paper attributes on citation relationships. To address these two problems, this paper proposes an academic paper recommendation approach based on citation side information embedding (CERec). First, it extracts multiple quality factors of a paper to form an influence value, and uses that value as the paper's weight to construct an influence network. Then, it combines the paper's influence with citation information and performs graph embedding using the paper's side information. Finally, it computes the cosine similarity of the paper embedding vectors to produce the recommendation list. Experimental results show that methods that incorporate side information outperform those that do not. Compared with currently popular representation-based recommendation algorithms, CERec improves recall and NDCG by an average of 5.054% and 5.246%, respectively.

Key words: paper recommendation, influence network, side information, graph embedding, cold start
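
The last step described in the abstract ranks candidate papers by the cosine similarity of their embedding vectors, and the evaluation reports recall and NDCG. The following is a minimal Python sketch of that ranking step and of the two metrics, not the authors' implementation. The function names, the toy two-dimensional embeddings, and the binary-relevance form of NDCG are all assumptions made for illustration; the embeddings are taken as given rather than learned by CERec.

```python
import math
from typing import Dict, List, Sequence, Set

Vector = Sequence[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(query_id: str, embeddings: Dict[str, Vector], k: int = 10) -> List[str]:
    """Return the k papers most similar to query_id (hypothetical helper)."""
    query = embeddings[query_id]
    scored = [
        (cosine_similarity(query, vec), pid)
        for pid, vec in embeddings.items()
        if pid != query_id
    ]
    # Highest similarity first; ties broken by paper id for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [pid for _, pid in scored[:k]]


def recall_at_k(recommended: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant papers that appear in the top-k list."""
    if not relevant:
        return 0.0
    hits = sum(1 for pid in recommended[:k] if pid in relevant)
    return hits / len(relevant)


def ndcg_at_k(recommended: List[str], relevant: Set[str], k: int) -> float:
    """NDCG@k with binary relevance (an assumed form; the abstract does not specify one)."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, pid in enumerate(recommended[:k])
        if pid in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    # Toy embeddings, invented for illustration only.
    toy = {
        "p1": [1.0, 0.0],
        "p2": [0.9, 0.1],
        "p3": [0.0, 1.0],
        "p4": [0.5, 0.5],
    }
    top = recommend("p1", toy, k=2)
    print(top, recall_at_k(top, {"p2", "p3"}, 2), ndcg_at_k(top, {"p2", "p3"}, 2))
```

The sketch scores every candidate against the query, which is adequate for small collections; a large corpus would typically need an approximate nearest-neighbour index instead.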