Computer Engineering and Applications ›› 2015, Vol. 51 ›› Issue (21): 1-5.

• Doctoral Forum •

Improved privacy-preserving classification method using non-negative matrix factorization

LI Guang (李光), HUI Meng (惠萌)

  1. School of Electronic and Control Engineering, Chang’an University, Xi’an 710064, China
  • Publication date: 2015-11-01 Release date: 2015-11-16

Abstract: Existing privacy-preserving data mining methods based on non-negative matrix factorization (NMF) do not distinguish samples by importance and perturb every sample with the same strength. To address this problem, a privacy-preserving classification method that combines NMF with sample selection is proposed. The method uses sample selection to divide the original samples into two groups: important and unimportant. All samples are first perturbed with the existing NMF-based method; the unimportant samples are then perturbed further by exploiting the clustering property of NMF. Experiments show that the method provides better protection of private information while maintaining data utility.

Key words: privacy protection, data mining, data perturbation, non-negative matrix factorization, sample selection, classification
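
The abstract does not state the sample-selection rule or the exact form of the additional perturbation, so the following Python sketch only illustrates the overall pipeline under assumptions that are not taken from the paper. The function names (`nmf`, `important_mask`, `perturb`) are hypothetical. The sketch assumes that a sample is "important" when its nearest neighbour has a different class label, that the base perturbation is the low-rank reconstruction `W @ H`, and that the additional perturbation replaces each unimportant sample's coefficient row with the mean row of the unimportant samples in the same NMF cluster.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (samples x features) as W @ H using
    multiplicative updates for the squared-error objective."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def important_mask(X, y):
    """Assumed selection rule (not from the paper): a sample is important
    if its nearest neighbour carries a different class label."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # ignore distance to itself
    return y[d.argmin(axis=1)] != y

def perturb(X, y, rank=2):
    """Return the perturbed data and the boolean importance mask."""
    W, H = nmf(X, rank)                  # base perturbation: X is replaced by W @ H
    important = important_mask(X, y)
    clusters = W.argmax(axis=1)          # NMF clustering: dominant basis per sample
    W_out = W.copy()
    for c in np.unique(clusters):
        idx = (clusters == c) & ~important
        if idx.any():
            # Assumed additional perturbation: collapse the unimportant
            # samples of a cluster onto their mean coefficient row.
            W_out[idx] = W[idx].mean(axis=0)
    return W_out @ H, important

# Toy data: six non-negative samples, four features, two classes.
X = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.9, 1.1, 0.0, 0.1],
              [1.1, 1.0, 0.2, 0.1],
              [0.1, 0.0, 1.0, 0.9],
              [0.0, 0.2, 1.1, 1.0],
              [0.6, 0.5, 0.5, 0.6]])
y = np.array([0, 0, 0, 1, 1, 1])
X_perturbed, important = perturb(X, y, rank=2)
```

In this sketch the important samples keep their individual NMF reconstruction, so a classifier trained on `X_perturbed` still sees the samples the selection rule considers informative, while the unimportant samples lose their individual detail.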