Computer Engineering and Applications, 2012, Vol. 48, Issue (23): 16-20.

• Doctoral Forum •

Personalized (l,c)-anonymity algorithm based on clustering

WANG Pingshui^1,2, WANG Jiandong^1

  1. College of Computer Science and Technology, Nanjing University of Aeronautics & Astronautics, Nanjing 210016, China
  2. School of Management Science and Engineering, Anhui University of Finance & Economics, Bengbu, Anhui 233030, China
  • Online: 2012-08-11 Published: 2012-08-21

Abstract: Most existing l-diversity anonymization algorithms treat all sensitive attribute values equally, ignoring both their degree of sensitivity and their actual distribution, which leaves them vulnerable to similarity attacks and skewness attacks. Moreover, these algorithms build equivalence classes by full-domain generalization, which incurs high information loss. This paper proposes a personalized (l,c)-anonymity algorithm based on clustering. It improves the security of published data by defining a maximal ratio threshold and a sensitivity for each sensitive attribute value, and it reduces information loss by using clustering to form the equivalence classes. Theoretical analysis and experimental results indicate that the method is effective and feasible.

Key words: data release, privacy preservation, l-diversity, similarity attack, skewness attack
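
The abstract does not give the formal definition of (l,c)-anonymity, so the following Python sketch only illustrates one plausible reading of the "maximal ratio threshold": an equivalence class is accepted if it holds at least l distinct sensitive values and no single value accounts for more than a fraction c of its records. The function name `satisfies_l_c` and this interpretation are assumptions made for illustration, and the per-value sensitivity weights mentioned in the abstract are not modeled here.

```python
from collections import Counter


def satisfies_l_c(sensitive_values, l, c):
    """Check one equivalence class against an assumed (l, c) condition.

    Assumed reading (not taken from the paper): the class needs at least
    `l` distinct sensitive values, and the most frequent value may cover
    at most a fraction `c` of the records in the class.
    """
    if not sensitive_values:
        return False
    counts = Counter(sensitive_values)
    max_ratio = max(counts.values()) / len(sensitive_values)
    return len(counts) >= l and max_ratio <= c


# A skewed class: two distinct values, but one value dominates.
skewed = ["HIV", "HIV", "HIV", "flu"]
# A balanced class: four distinct values, each covering a quarter.
balanced = ["HIV", "flu", "cold", "asthma"]

print(satisfies_l_c(skewed, l=2, c=0.5))
print(satisfies_l_c(balanced, l=2, c=0.5))
```

Under this reading, the skewed class has two distinct values and would pass a plain distinct-value l-diversity check with l = 2, yet it fails the ratio bound because one value covers 75% of its records. This is the kind of skewness the abstract says plain l-diversity overlooks.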