Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (25): 149-151. DOI: 10.3778/j.issn.1002-8331.2008.25.045

• Database, Signal and Information Processing •

Clustering algorithm for mixed-attribute data streams based on a dissimilarity matrix

WAN Ren-xia (万仁霞), WANG Li-xin (王立新), LIU Zhen-wen (刘振文)

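  1. College of Information Science and Technology, Donghua University, Shanghai 201620
  • Received: 2007-11-01 Revised: 2008-01-21 Published: 2008-09-01 Released: 2008-09-01
  • Corresponding author: WAN Ren-xia (万仁霞)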

Novel algorithm for clustering heterogeneous data stream based on dissimilarity matrix

WAN Ren-xia, CHEN Jing-chao, WANG Li-xin

  1. College of Information Science and Technology, Donghua University, Shanghai 201620, China
  • Received: 2007-11-01 Revised: 2008-01-21 Online: 2008-09-01 Published: 2008-09-01
  • Contact: WAN Ren-xia

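Abstract: Clustering is an important problem in data stream mining. This paper proposes a clustering algorithm for data streams with mixed attributes. It uses dissimilarity in place of the usual clustering distance and introduces an equivalent dissimilarity matrix into the clustering process. Experiments on real data sets show that the algorithm achieves better clustering performance than comparable algorithms.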

Keywords: data stream, dissimilarity, clustering, mixed attributes

Abstract: Data stream clustering is an important issue in data stream mining. This paper presents a novel algorithm for clustering data streams with heterogeneous attributes. It uses dissimilarity in place of the usual clustering distance, and it introduces an equivalent dissimilarity matrix into the clustering process. Experiments on real data sets indicate that the algorithm achieves better clustering performance than the CluStream and HCluStream algorithms.

Key words: data stream, dissimilarity, cluster, heterogeneous attributes
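
The abstract does not state how dissimilarity between heterogeneous records is computed, so the following is only a minimal sketch under stated assumptions, not the authors' method. It uses a Gower-style measure: a numeric attribute contributes its range-normalized absolute difference, and a categorical attribute contributes 0 on a match and 1 otherwise. The function names, the `numeric_ranges` parameter, and the sample records are all hypothetical.

```python
from typing import List, Optional, Sequence


def mixed_dissimilarity(a: Sequence, b: Sequence,
                        numeric_ranges: Sequence[Optional[float]]) -> float:
    """Gower-style dissimilarity in [0, 1] between two mixed-attribute records.

    numeric_ranges[i] is the value range of attribute i when it is numeric,
    or None when attribute i is categorical (an assumed encoding).
    """
    total = 0.0
    for x, y, rng in zip(a, b, numeric_ranges):
        if rng is None:
            # Categorical attribute: simple mismatch indicator.
            total += 0.0 if x == y else 1.0
        elif rng > 0:
            # Numeric attribute: absolute difference scaled by its range.
            total += abs(x - y) / rng
    return total / len(numeric_ranges)


def dissimilarity_matrix(records: Sequence[Sequence],
                         numeric_ranges: Sequence[Optional[float]]) -> List[List[float]]:
    """Symmetric pairwise dissimilarity matrix with a zero diagonal."""
    n = len(records)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = mixed_dissimilarity(records[i], records[j], numeric_ranges)
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical records: one numeric attribute (range 4.0), one categorical.
records = [(1.0, "tcp"), (3.0, "tcp"), (5.0, "udp")]
ranges = [4.0, None]
matrix = dissimilarity_matrix(records, ranges)
```

In a streaming setting such a matrix would presumably be built over cluster summaries rather than raw records, and the "equivalent" matrix named in the abstract would be derived from it in a further step that this sketch does not attempt.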