Computer Engineering and Applications, 2011, Vol. 47, Issue (3): 132-134. DOI: 10.3778/j.issn.1002-8331.2011.03.040

• Databases, Signal and Information Processing •

Semi-supervised clustering algorithm based on classifier

LI Shan, ZHANG Huaxiang

  1. School of Information Science and Engineering, Shandong Normal University, Jinan 250014, China
  • Received: 2009-05-07 Revised: 2009-07-09 Online: 2011-01-21 Published: 2011-01-21
  • Contact: LI Shan

Abstract: A semi-supervised clustering algorithm based on classification is proposed. The algorithm uses the small number of labeled objects in the data set to coarsely classify the original data, and extends the cluster-center selection method of traditional k-means. It then coarsely clusters the data set with the k-meansGuider method and, finally, integrates the coarse clustering results. Experiments on several UCI machine learning benchmark data sets show that the proposed algorithm can effectively improve clustering quality.

Key words: semi-supervised learning, clustering, k-means clustering
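
The abstract describes choosing k-means cluster centers with the help of a few labeled objects, but it does not spell out the k-meansGuider procedure or the integration step. As a rough illustration of the general idea only, the sketch below seeds each center with the mean of the labeled points of one class and then runs ordinary k-means iterations. The function names `seed_centers` and `seeded_kmeans`, the use of `None` to mark unlabeled points, and the toy data are all assumptions made for this example; this is not the authors' algorithm.

```python
import math
from collections import defaultdict


def seed_centers(points, labels):
    """Average the labeled points of each class to get one initial center per class."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        if label is not None:  # None marks an unlabeled point (assumed convention)
            groups[label].append(point)
    return {
        label: tuple(sum(axis) / len(members) for axis in zip(*members))
        for label, members in groups.items()
    }


def seeded_kmeans(points, labels, max_iter=100):
    """Plain k-means whose initial centers come from the labeled points.

    Returns (assignments, centers), where assignments[i] is the class label
    of the center nearest to points[i].
    """
    centers = seed_centers(points, labels)
    assignments = None
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest center.
        new_assignments = [
            min(centers, key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the procedure has converged
        assignments = new_assignments
        # Update step: move each center to the mean of its current members.
        for label in centers:
            members = [p for p, a in zip(points, assignments) if a == label]
            if members:
                centers[label] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return assignments, centers


# Toy data: two well-separated groups with one labeled point in each.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", None, None, "b", None, None]
assignments, centers = seeded_kmeans(points, labels)
print(assignments)
```

Seeding from labeled points fixes the number of clusters to the number of labeled classes and removes the random initialization that plain k-means depends on. In this sketch the labeled points may still be reassigned during iteration; keeping them fixed would be a different design choice.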
