Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (3): 68-73. DOI: 10.3778/j.issn.1002-8331.1906-0417

• Big Data and Cloud Computing •

Collaborative Filtering Recommendation Algorithm Based on Clustering and User Preference

WANG Weihong, ZENG Yingjie

  1. College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China
  • Published: 2020-02-01 Online: 2020-01-20

Abstract: Collaborative filtering recommendation algorithms learn from rating data. To address the sparsity of rating data and the limited scalability of collaborative filtering, this paper proposes a collaborative filtering recommendation algorithm based on clustering and user preference. To mine user preferences, the algorithm adds each user's average rating for each item type to the rating matrix and incorporates a similarity measure based on the user's own attributes. To reduce data sparsity, it fills the unrated entries of the rating data with the Weighted Slope One algorithm, then clusters the users in the filled data with a K-means algorithm whose initial cluster centers are optimized using density and distance, which narrows the search space for similar users. Finally, a traditional collaborative filtering recommendation algorithm is applied within the clustered data set to generate recommendations for the target user. Experiments on the MovieLens 100K dataset show that the proposed algorithm improves recommendation quality.

Key words: collaborative filtering, k-means clustering, user preference, similarity, Weighted Slope One algorithm
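
The filling step named in the abstract can be illustrated with a minimal sketch of the standard Weighted Slope One predictor. This is not the authors' implementation: the function name `weighted_slope_one_fill`, the use of NaN to mark unrated items, the global-mean fallback, and the toy matrix are all assumptions made for illustration.

```python
import numpy as np


def weighted_slope_one_fill(ratings):
    """Return a copy of a users x items matrix with NaN entries filled.

    Illustrative sketch only: the function name, the NaN convention for
    "unrated", and the global-mean fallback are assumptions, not taken
    from the paper.
    """
    R = np.asarray(ratings, dtype=float)
    rated = ~np.isnan(R)                 # True where a rating exists
    M = rated.astype(float)              # 0/1 indicator matrix
    R0 = np.where(rated, R, 0.0)         # ratings with NaN replaced by 0

    # counts[j, i]: number of users who rated both item j and item i.
    counts = M.T @ M
    # diff_sum[j, i]: sum over those users of (r_uj - r_ui).
    diff_sum = R0.T @ M - M.T @ R0

    # Weighted Slope One prediction for user u and item j:
    #   sum_i counts[j, i] * (dev[j, i] + r_ui) / sum_i counts[j, i]
    # over the items i that u rated, with dev = diff_sum / counts.
    # Since counts * dev == diff_sum, no per-pair division is needed.
    numer = M @ diff_sum.T + R0 @ counts.T
    denom = M @ counts.T

    # Assumed fallback when no co-rated item exists: global mean rating.
    fallback = R[rated].mean()
    pred = np.divide(numer, denom,
                     out=np.full_like(numer, fallback),
                     where=denom > 0)

    # Keep observed ratings, use predictions only for missing entries.
    return np.where(rated, R, pred)


# Toy data: 3 users x 3 items, NaN marks an unrated item.
ratings = np.array([
    [5.0, 3.0, 2.0],
    [3.0, 4.0, np.nan],
    [np.nan, 2.0, 5.0],
])
filled = weighted_slope_one_fill(ratings)
print(np.round(filled, 2))
```

On this toy matrix the third user's missing rating for the first item comes out as 13/3 (about 4.33), and the second user's missing rating for the third item as 10/3 (about 3.33). A filled matrix of this kind is what the clustering step described in the abstract would then operate on.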