Cluster-center-distance maximization clustering with knowledge transfer

Abstract

Traditional clustering algorithms tend to fail in two situations: when the data are sparse or corrupted by substantial noise or outliers, and when the raw data are proportionally scaled to control the differences among data points. To address both problems, this paper devises a historical knowledge transfer mechanism and a maximum cluster-center-distance mechanism and combines them with the classical Maximum Entropy Clustering (MEC) approach, yielding center distance maximization clustering with historical knowledge transfer (HKT-CDMC). HKT-CDMC has three main merits. First, guided by historical knowledge, it remains effective when the data are insufficient or heavily contaminated by noise. Second, after data scaling, the cluster centers obtained by classical clustering methods tend to lie too close together; HKT-CDMC avoids this through the maximum cluster-center-distance mechanism. Third, because the historical knowledge cannot be mapped back to the raw data, HKT-CDMC protects the privacy of the source domain. Experiments on both artificial and real-world datasets demonstrate these merits.

Keywords: transfer learning, historical knowledge, maximum cluster-center distance, privacy protection, fuzzy clustering

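Abstract (translated from the Chinese): Traditional clustering algorithms risk failing outright in two situations: when the data are scarce or contain a large amount of interfering data, and when the dataset is scaled to regulate the differences among the data. To solve both problems at once, a historical knowledge transfer criterion and a center-distance maximization criterion are proposed and applied to the maximum entropy clustering algorithm; the result is called the center maximization clustering algorithm with historical transfer capability. The algorithm has three notable advantages. When the current data are scarce or contaminated, it makes effective use of historical knowledge for transfer learning and thereby achieves good clustering effectiveness. When the data are scaled by a certain factor, the cluster centers obtained by traditional clustering algorithms tend to coincide, whereas the algorithm uses the cluster-center-distance maximization criterion to avoid this. The historical knowledge it uses never exposes the historical source data, so the algorithm protects the privacy of the historical data well. Experiments on simulated and real datasets verify these advantages.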

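Keywords (translated from the Chinese): transfer learning, historical knowledge, maximum cluster-center distance, privacy protection, fuzzy clustering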
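
The abstract notes that, after data scaling, classical clustering methods can return cluster centers that nearly coincide. The sketch below illustrates that phenomenon with the textbook form of classical MEC only (softmax memberships with an entropy weight, membership-weighted centers). It is not HKT-CDMC: the paper's objective function, its knowledge transfer term, and its center-distance term are not given on this page and are not reproduced here. The function names `mec` and `center_gap`, the parameter `gamma`, the synthetic two-blob data, and the scaling factor 0.01 are all assumptions made for illustration.

```python
import numpy as np

def mec(X, V0, gamma=1.0, n_iter=100):
    """Classical maximum entropy clustering via alternating updates.

    X: (n, d) data; V0: (c, d) initial centers; gamma: entropy weight.
    All names here are illustrative assumptions, not the paper's notation.
    """
    V = V0.copy()
    for _ in range(n_iter):
        # Squared distance from every point to every center, shape (n, c).
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        # Membership update: softmax of -d2 / gamma across clusters.
        logits = -d2 / gamma
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        U = np.exp(logits)
        U /= U.sum(axis=1, keepdims=True)
        # Center update: membership-weighted mean of the data.
        V = (U.T @ X) / U.sum(axis=0)[:, None]
    return V, U

def center_gap(V):
    """Euclidean distance between the first two centers."""
    return np.linalg.norm(V[0] - V[1])

# Synthetic data (an assumption): two well-separated Gaussian blobs.
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))
X = np.vstack([blob_a, blob_b])
init = X[[0, -1]]  # one starting center taken from each blob

scale = 0.01  # hypothetical scaling factor applied to the raw data
V_raw, _ = mec(X, init, gamma=1.0)
V_scaled, _ = mec(scale * X, scale * init, gamma=1.0)

gap_raw = center_gap(V_raw)
# Divide by the scale so both gaps are expressed in the raw data's units.
gap_scaled_back = center_gap(V_scaled) / scale

print("center gap on raw data:   ", gap_raw)
print("center gap on scaled data:", gap_scaled_back)
```

With `gamma` held fixed, shrinking the data makes all squared distances small relative to `gamma`, so the memberships approach uniform values and both centers drift to the global mean. A mechanism that explicitly rewards distance between centers, as the abstract describes for HKT-CDMC, is one way to counteract this.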

SUN Shouwei, QIAN Pengjiang, CHEN Aiguo, JIANG Yizhang. Cluster-center-distance maximization clustering with knowledge transfer[J]. Computer Engineering and Applications, 2016, 52(16): 149-155.

孙寿伟，钱鹏江，陈爱国，蒋亦樟. 具备迁移能力的类中心距离极大化聚类算法[J]. 计算机工程与应用, 2016, 52(16): 149-155.

