Computer Engineering and Applications ›› 2010, Vol. 46 ›› Issue (34): 136-139. DOI: 10.3778/j.issn.1002-8331.2010.34.041

• Database, Signal and Information Processing •

Arbitrary shape clustering for mixed attributes dataset

SU Xiao-ke1,2, LAN Yang3, CHENG Yao-dong4, WAN Ren-xia1

  1. College of Information Science and Technology, Donghua University, Shanghai 201620, China
  2. School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China
  3. School of Computer and Information Technology, Xinyang Normal University, Xinyang, Henan 464000, China
  4. Computing Center, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2009-05-07 Revised: 2010-07-08 Online: 2010-12-01 Published: 2010-12-01
  • Contact: SU Xiao-ke

Abstract: Clustering is an active research area in data mining, and finding clusters of arbitrary shape remains an open problem. This paper introduces an inter-cluster dissimilarity measure that takes into account the frequency information of categorical attribute values. Based on a definition of the similarity between an object and a cluster, it proposes a clustering algorithm that can find clusters of arbitrary shape and can be applied to mixed-attribute datasets. Experimental results on synthetic and real-life datasets show that the proposed algorithm is feasible and effective compared with other classical algorithms.
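
The abstract does not give the formulas themselves, so the following is only an illustrative sketch of how a frequency-aware dissimilarity between an object and a cluster might look for mixed attributes. Everything in it is an assumption, not the paper's definition: the names `ClusterSummary` and `object_cluster_dissimilarity` are hypothetical, numeric attributes are assumed to be pre-scaled to [0, 1], and the categorical term is taken to be one minus the relative frequency of the object's value within the cluster.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ClusterSummary:
    """Hypothetical cluster summary: size, numeric sums, categorical value counts."""
    size: int = 0
    numeric_sums: list = field(default_factory=list)
    cat_counts: list = field(default_factory=list)  # one Counter per categorical attribute

    def add(self, numeric, categorical):
        """Absorb one object, given as separate numeric and categorical value lists."""
        if self.size == 0:
            self.numeric_sums = [0.0] * len(numeric)
            self.cat_counts = [Counter() for _ in categorical]
        for i, x in enumerate(numeric):
            self.numeric_sums[i] += x
        for counter, value in zip(self.cat_counts, categorical):
            counter[value] += 1
        self.size += 1


def object_cluster_dissimilarity(numeric, categorical, cluster):
    """Assumed measure: average per-attribute difference, each term in [0, 1].

    Numeric term: absolute distance to the cluster mean (inputs pre-scaled to [0, 1]).
    Categorical term: 1 - (frequency of the object's value in the cluster / cluster size).
    """
    total = 0.0
    for i, x in enumerate(numeric):
        total += abs(x - cluster.numeric_sums[i] / cluster.size)
    for counter, value in zip(cluster.cat_counts, categorical):
        total += 1.0 - counter[value] / cluster.size  # Counter returns 0 for unseen values
    return total / (len(numeric) + len(categorical))
```

Under these assumptions, a categorical value that is common in the cluster contributes little to the dissimilarity, while a value never seen in the cluster contributes the maximum of 1. This is one way frequency information can enter such a measure.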
