Computer Engineering and Applications, 2022, Vol. 58, Issue (10): 108-115. DOI: 10.3778/j.issn.1002-8331.2012-0457

• Network, Communication and Security •

Differential Privacy Protection k-means Algorithm Based on BWP Index

ZHANG Yaling, QU Lingyu

  1. School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, China
  • Online: 2022-05-15 Published: 2022-05-15

Abstract: Differential privacy is a privacy protection technique based on noise perturbation. To address the large deviation of cluster centers caused by the noise added under differential privacy protection, this paper proposes a differentially private k-means algorithm based on the BWP (between-within proportion) index. The algorithm introduces BWP, a cluster validity index, into the privacy budget allocation process and weights the traditional allocation, so that within a single iteration clusters with different density distributions receive different privacy budgets and therefore different amounts of random noise. Theoretical analysis shows that the new algorithm satisfies ε-differential privacy. Experiments on four standard data sets show that the new algorithm has advantages in the usability of the clustering results and in the stability of the algorithm.

Key words: clustering, k-means algorithm, BWP index, differential privacy, privacy budget allocation
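
The abstract does not give the weighting formula, so the following is only a minimal sketch of what a BWP-weighted budget split within one iteration might look like, not the authors' method. It assumes the commonly cited per-sample definition BWP = (b - w) / (b + w), where b is the smallest average distance to another cluster and w is the average distance to the rest of the sample's own cluster. The function names, the rule "lower BWP gets a larger share", the `floor` parameter, the [0, 1] data range, and the sensitivity value d + 1 are all assumptions made for illustration. The sketch makes no privacy claim: in particular, it computes BWP directly from the raw data.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points given as sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_dist(p, points):
    """Average distance from p to every point in a non-empty list."""
    return sum(dist(p, q) for q in points) / len(points)


def cluster_bwp(clusters):
    """Mean BWP per cluster, using BWP = (b - w) / (b + w) per sample."""
    scores = []
    for j, members in enumerate(clusters):
        total = 0.0
        for i, p in enumerate(members):
            others = members[:i] + members[i + 1:]
            # w: average distance to the rest of the sample's own cluster.
            w = mean_dist(p, others) if others else 0.0
            # b: smallest average distance to any other non-empty cluster.
            b = min(
                (mean_dist(p, c) for k, c in enumerate(clusters) if k != j and c),
                default=0.0,
            )
            total += (b - w) / (b + w) if b + w > 0 else 0.0
        scores.append(total / len(members) if members else 0.0)
    return scores


def allocate_budget(eps_iter, bwp_scores, floor=0.05):
    """Split one iteration's budget across clusters (hypothetical rule)."""
    # Assumption: a cluster with lower BWP receives a larger share of the
    # budget and hence less noise. The floor keeps every share positive.
    raw = [max(1.0 - s, floor) for s in bwp_scores]
    total = sum(raw)
    return [eps_iter * r / total for r in raw]


def laplace(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_centroid(members, eps, dim, rng):
    """Noisy sum divided by noisy count for one cluster."""
    # Assumption: every coordinate lies in [0, 1], so the (sum, count)
    # query is given L1 sensitivity dim + 1.
    scale = (dim + 1) / eps
    # Clamp the noisy count to at least 1 to avoid dividing by zero
    # or by a negative number.
    count = max(len(members) + laplace(scale, rng), 1.0)
    sums = [sum(p[d] for p in members) + laplace(scale, rng) for d in range(dim)]
    return [s / count for s in sums]


if __name__ == "__main__":
    rng = random.Random(0)
    clusters = [
        [(0.10, 0.10), (0.12, 0.11), (0.09, 0.13)],  # compact cluster
        [(0.70, 0.80), (0.90, 0.60), (0.60, 0.95)],  # looser cluster
    ]
    scores = cluster_bwp(clusters)
    budgets = allocate_budget(1.0, scores)
    centroids = [noisy_centroid(c, e, 2, rng) for c, e in zip(clusters, budgets)]
    print(scores, budgets, centroids)
```

In this toy input the compact cluster has the higher mean BWP and therefore, under the assumed rule, receives the smaller share of the iteration budget. The shares always sum to the iteration budget, so the split only redistributes noise among clusters; whether the paper weights in this direction or with this functional form cannot be determined from the abstract.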