Computer Engineering and Applications, 2022, Vol. 58, Issue (13): 27-35. DOI: 10.3778/j.issn.1002-8331.2111-0411

• Hot Topics and Reviews •

Research Progress of Similarity Algorithm in Collaborative Filtering (协同过滤中相似度算法研究进展)

LIU Hualing (刘华玲), GUO Yuan (郭渊), MA Jun (马俊)

  1. School of Statistics and Information, Shanghai University of International Business and Economics, Shanghai 201620, China
  • Online: 2022-07-01 Published: 2022-07-01

Abstract: Recommendation algorithms mine historical data to learn users' interests and preferences, then search the available data resources for matching content to recommend. Collaborative filtering is the most widely used family of recommendation algorithms and is applied across many fields, but data sparsity and the cold-start problem degrade recommendation quality. To improve recommendation accuracy, many researchers have focused on the similarity computation. This paper summarizes the most widely used collaborative filtering algorithms and the traditional similarity measures commonly used in recommender systems. It compares the applicable scenarios of similarity based on the Pearson correlation coefficient, cosine similarity, adjusted cosine similarity, and the Jaccard similarity coefficient. It then reviews the state of research on similarity with respect to cold start and data sparsity; the surveyed work indicates that computing user similarity with hybrid similarity measures improves recommendation quality. Finally, it summarizes the low recommendation efficiency and increased complexity that the improved methods in the literature introduce, and gives an outlook on improving similarity measures for both recommendation accuracy and efficiency.

Key words: collaborative filtering, similarity, data sparsity, cold start