Computer Engineering and Applications, 2019, Vol. 55, Issue (1): 9-22. DOI: 10.3778/j.issn.1002-8331.1809-0297

• Hot Topics and Reviews •

Ensemble Detection Method for Shilling Attacks Based on Deep Sparse Autoencoder

HAO Yaojun1,2, ZHANG Fuzhi2

  1. Department of Computer, Xinzhou Teachers University, Xinzhou, Shanxi 034000, China
  2. School of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066004, China
  • Online: 2019-01-01 Published: 2019-01-07

Abstract: In recommender systems based on collaborative filtering, malicious users can bias the recommendation output by injecting a large number of fake profiles, thereby achieving their attack goals. To detect such shilling attacks, existing methods extract features of attack profiles from different perspectives, mainly from users' rating values or from the assumption that attack ratings are concentrated in a short time span. However, these detection methods depend heavily on the quality of manual feature engineering; the hand-crafted features are mostly limited to specific attack types, and extracting them requires considerable domain knowledge. To address these problems, this paper starts from users' temporal preferences for the items they rate and proposes an ensemble detection method for shilling attacks that uses a deep sparse autoencoder to extract detection features automatically. First, the wavelet transform is used to assign item popularity in different time intervals to several grades, and the rating data are preprocessed to obtain a user-item temporal popularity grade matrix. Second, a deep sparse autoencoder automatically extracts features from this matrix, yielding low-level feature representations of user rating patterns and removing the need for manual feature engineering. Finally, with SVM as the base classifier, attacks are detected using the features extracted at each layer of the deep sparse autoencoder, and the final ensemble result is generated by voting over the per-layer detection results. Experiments on the Netflix dataset indicate that the proposed method achieves good detection performance against average attacks, AoP attacks, shifting attacks, power item attacks, and power user attacks.

Key words: collaborative filtering, shilling attacks, shilling attack detection, deep sparse autoencoder, item temporal popularity grade
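
The first and last steps of the pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `build_grade_matrix` and `majority_vote` are hypothetical, equal-width binning of per-interval rating counts stands in for the wavelet-based grading, majority voting is assumed as the voting rule, and the deep sparse autoencoder and SVM stages are omitted.

```python
from collections import Counter, defaultdict


def build_grade_matrix(ratings, interval_len, n_grades):
    """Build a user-item temporal popularity grade matrix.

    ratings: list of (user, item, timestamp) tuples with integer timestamps.
    Returns {user: {item: grade}}, where grade is in 1..n_grades and reflects
    how popular the item was in the time interval in which the user rated it.
    """
    # Popularity of an item in an interval = number of ratings it received there.
    counts = Counter((item, ts // interval_len) for _, item, ts in ratings)
    max_count = max(counts.values())

    matrix = defaultdict(dict)
    for user, item, ts in ratings:
        count = counts[(item, ts // interval_len)]
        # ASSUMPTION: equal-width binning replaces the paper's wavelet-based
        # grading, whose details are not given in the abstract.
        matrix[user][item] = 1 + (count - 1) * n_grades // max_count
    return {user: dict(row) for user, row in matrix.items()}


def majority_vote(layer_predictions):
    """Combine per-layer detection results into one label per user.

    layer_predictions: one list per autoencoder layer, each holding a label
    per user (1 = attack profile, 0 = genuine profile).
    ASSUMPTION: simple majority voting; ties are resolved as genuine.
    """
    n_layers = len(layer_predictions)
    return [
        1 if 2 * sum(votes) > n_layers else 0
        for votes in zip(*layer_predictions)
    ]
```

In a fuller version, each row of the grade matrix would be encoded by the autoencoder, one SVM would be trained on the features of each hidden layer, and the per-layer SVM predictions would be passed to `majority_vote`.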