Computer Engineering and Applications ›› 2018, Vol. 54 ›› Issue (8): 85-90. DOI: 10.3778/j.issn.1002-8331.1612-0040

Variable-length clustering-based approach for personalization anonymization protection

LI Dan, LING Jie   

  1. Faculty of Computer, Guangdong University of Technology, Guangzhou 510006, China
  • Online: 2018-04-15 Published: 2018-05-02


李  丹,凌  捷   

  1. 广东工业大学 计算机学院,广州 510006

Abstract: To counter the privacy disclosure caused by linking attacks, while reducing the information loss of anonymization and improving the usability of published data sets, this paper presents an individual-oriented personalized anonymization method based on variable-length clustering. The method takes full account of the influence of record weights on the resulting cluster centers, which improves data usability, and it grades sensitive attribute values into three levels to meet the protection requirements of different individuals. Theoretical analysis and experimental results show that the method satisfies personalized protection requirements for sensitive attributes while effectively reducing information loss, that it runs efficiently, and that the anonymized data sets it generates have good usability.

Key words: privacy protection, data publishing, variable-length clustering, personalization anonymization, information loss
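
The two ideas named in the abstract, record-weighted cluster centers and three sensitivity levels leading to clusters of varying size, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not give the weight definition or the per-level parameters, so the weighted-mean formula, the `MIN_CLUSTER_SIZE` table, and both function names are assumptions made purely for illustration.

```python
from typing import Dict, List, Sequence

# Hypothetical mapping from sensitivity level (1 = low, 3 = high) to the
# minimum number of records a cluster must hold. The actual values used in
# the paper are not stated in the abstract.
MIN_CLUSTER_SIZE: Dict[int, int] = {1: 2, 2: 4, 3: 6}


def weighted_center(records: Sequence[Sequence[float]],
                    weights: Sequence[float]) -> List[float]:
    """Return the weighted mean of numeric quasi-identifier vectors.

    Assumes every record has the same number of attributes and that the
    weights are non-negative with a positive sum.
    """
    if len(records) != len(weights) or not records:
        raise ValueError("records and weights must be non-empty and aligned")
    total = sum(weights)
    if total <= 0:
        raise ValueError("sum of weights must be positive")
    dims = len(records[0])
    return [
        sum(w * r[d] for r, w in zip(records, weights)) / total
        for d in range(dims)
    ]


def required_cluster_size(levels: Sequence[int]) -> int:
    """Return the size a cluster needs to satisfy its strictest member.

    Because the requirement depends on which records are grouped together,
    cluster sizes vary from cluster to cluster.
    """
    return max(MIN_CLUSTER_SIZE[level] for level in levels)


if __name__ == "__main__":
    # Two records with attributes (age, years of education); the second
    # record carries three times the weight of the first.
    print(weighted_center([[20.0, 1.0], [40.0, 3.0]], [1.0, 3.0]))
    print(required_cluster_size([1, 3, 2]))
```

In this sketch a heavier record pulls the cluster center toward itself, and a cluster that contains even one record at the highest sensitivity level must grow to the largest minimum size.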
