Computer Engineering and Applications ›› 2007, Vol. 43 ›› Issue (18): 165-167.

• Database and Information Processing •

Privacy preserving clustering over distributed data

ZHANG Guo-rong¹, YIN Jian²

  1. Computer Staff Room, Guangzhou Academy of Fine Arts, Guangzhou 510260, China
  2. School of Information Science & Technology, Sun Yat-sen University, Guangzhou 510275, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-06-21 Published: 2007-06-21
  • Contact: ZHANG Guo-rong

Abstract: Privacy preservation is an important research direction in data mining. This paper addresses the problem of applying the k-means clustering algorithm to discover meaningful knowledge from data without sharing precise individual data records, and proposes an algorithm based on secure multi-party computation. The algorithm uses a secure mean protocol (secure_mean), run with the participation of a semi-trusted third party, to meet the privacy-preservation requirements of k-means clustering over distributed data. Experiments show that the algorithm hides attribute values well, protects private information, and does not affect the clustering results.
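
The abstract does not spell out the secure_mean protocol, so the following is only a minimal sketch of one common way such a step could work, not the authors' protocol. It assumes additive masking: the data holders agree among themselves on random masks that cancel out, each sends its masked local sum and count for a cluster to the third party, and the third party can recover only the global mean. All names here (`zero_sum_masks`, `party_message`, `third_party_mean`, `MOD`, `SCALE`) are hypothetical, and in this sketch the third party still learns the global sum and count.

```python
import secrets

MOD = 2**64      # masked arithmetic is done modulo this value (assumption)
SCALE = 10**6    # fixed-point scale for encoding real-valued sums (assumption)


def encode(x):
    """Encode a real number as a fixed-point integer modulo MOD."""
    return round(x * SCALE) % MOD


def decode(v):
    """Map a value in [0, MOD) back to a signed real number."""
    if v >= MOD // 2:
        v -= MOD
    return v / SCALE


def zero_sum_masks(n_parties, length):
    """Random masks, one per party, whose entries sum to 0 modulo MOD.

    In a real deployment the parties would derive these among themselves,
    without the third party; here they are generated in one place.
    """
    masks = [[secrets.randbelow(MOD) for _ in range(length)]
             for _ in range(n_parties - 1)]
    last = [(-sum(m[i] for m in masks)) % MOD for i in range(length)]
    return masks + [last]


def party_message(local_points, mask):
    """Masked per-dimension sums plus masked count for one party's points."""
    dim = len(mask) - 1
    sums = [sum(p[d] for p in local_points) for d in range(dim)]
    payload = [encode(s) for s in sums] + [encode(len(local_points))]
    return [(v + m) % MOD for v, m in zip(payload, mask)]


def third_party_mean(messages):
    """Add the masked messages; the masks cancel, leaving global sum and count."""
    totals = [decode(sum(col) % MOD) for col in zip(*messages)]
    count = totals[-1]  # assumed non-zero: the cluster has at least one point
    return [s / count for s in totals[:-1]]


# Two parties hold points assigned to the same cluster (2 dimensions).
points_a = [(1.0, 2.0), (3.0, 4.0)]
points_b = [(5.0, 6.0)]
masks = zero_sum_masks(n_parties=2, length=3)  # 2 dimensions + 1 count slot
messages = [party_message(points_a, masks[0]),
            party_message(points_b, masks[1])]
print(third_party_mean(messages))
```

In a k-means setting, a step like this would be repeated for each cluster in each iteration, with every party assigning its own points to the nearest centroid locally before contributing masked sums.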