Computer Engineering and Applications ›› 2006, Vol. 42 ›› Issue (5): 15-.

• Doctoral Forum •

A Strategy for Detecting the Changes in Cluster Model of Data Stream

Liu Shang, Huang Yalou, Ni Weijian

  1. College of Information, Nankai University
  • Received: 2005-10-13 Revised: 1900-01-01 Online: 2006-02-11 Published: 2006-02-11
  • Corresponding author: Liu Shang sunnyLiu

Abstract: Data streams are dynamic and change continuously. Detecting changes in a data stream's cluster model promptly, and reporting to the user exactly what has changed, can help the user devise better strategies. To meet this need, this paper proposes a change detection strategy for data streams based on CFT. The strategy uses the cluster statistics CFT to detect changes, compares the new cluster model with the original one, and reports the specific changes to each cluster individually. Its time complexity is O(K²), and experiments show that changes are detected accurately and reported in a fairly intuitive way.

Key words: data stream, clustering, data mining
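
The abstract gives no algorithmic detail, so the following is only an illustrative sketch of how comparing two cluster models pairwise might lead to per-cluster change reports and a quadratic cost. It is not the authors' method. The `ClusterSummary` fields, the `detect_changes` function, the thresholds `match_radius` and `count_ratio`, and the change labels are all hypothetical, and the sketch assumes that K denotes the number of clusters, which the abstract does not state.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Tuple


@dataclass(frozen=True)
class ClusterSummary:
    """Hypothetical per-cluster statistics; the paper's CFT layout is not given."""
    name: str
    count: int                      # number of points absorbed by the cluster
    centroid: Tuple[float, ...]     # mean position of those points


def detect_changes(old: List[ClusterSummary],
                   new: List[ClusterSummary],
                   match_radius: float = 1.0,
                   count_ratio: float = 0.2) -> List[str]:
    """Compare two cluster models and describe what happened to each cluster."""
    report = []
    matched_old = set()
    for n in new:
        # Each new cluster scans every old cluster for the nearest centroid,
        # so two models of K clusters each need about K * K distance checks.
        nearest = min(old, key=lambda o: dist(o.centroid, n.centroid), default=None)
        if nearest is None or dist(nearest.centroid, n.centroid) > match_radius:
            report.append(f"{n.name}: emerged")
            continue
        matched_old.add(nearest.name)
        # Relative change in size; max() guards against an empty old cluster.
        change = (n.count - nearest.count) / max(nearest.count, 1)
        if change > count_ratio:
            report.append(f"{n.name}: grew from {nearest.name}")
        elif change < -count_ratio:
            report.append(f"{n.name}: shrank from {nearest.name}")
        else:
            report.append(f"{n.name}: unchanged from {nearest.name}")
    # Old clusters that no new cluster matched are reported as gone.
    for o in old:
        if o.name not in matched_old:
            report.append(f"{o.name}: disappeared")
    return report


old_model = [ClusterSummary("O1", 100, (0.0, 0.0)),
             ClusterSummary("O2", 50, (5.0, 5.0))]
new_model = [ClusterSummary("N1", 150, (0.2, 0.1)),
             ClusterSummary("N2", 40, (10.0, 0.0))]
report = detect_changes(old_model, new_model)
for line in report:
    print(line)
```

In this toy example, N1 lies close to O1 and holds 50% more points, so it is reported as grown; N2 has no old cluster within `match_radius`, so it is reported as emerged; O2 is matched by nothing and is reported as disappeared. Richer statistics (for example, a cluster radius) could be compared in the same loop without changing the quadratic cost.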