Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (12): 24-28.


Fuzzy clustering of interval data based on Wasserstein distances in data mining

LI Hong, SUN Qiubi   

  1. Department of Statistics, Management College, Fuzhou University, Fuzhou 350108, China
  • Online: 2012-04-21  Published: 2012-04-20

Research on fuzzy clustering of interval data in data mining: based on the Wasserstein measure

李  红,孙秋碧   

  1. 福州大学 管理学院 统计系,福州 350108

Abstract: Because the distances currently used in fuzzy clustering models for interval data have limitations, this paper introduces Wasserstein distances for interval data and derives adaptive single-index and adaptive double-index fuzzy clustering models. Simulation results and the CR index demonstrate the advantages of the models. The models are of practical value in empirical work when data are unstable or missing.

Key words: fuzzy clustering, interval data, symbolic data analysis, adaptive

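Abstract (Chinese version): To address the limitations of the interval distance definitions used in current research on fuzzy clustering of interval data, this paper introduces the Wasserstein distance measure, which can account for the distribution of values within an interval, and proposes single-index and double-index adaptive fuzzy clustering algorithms and iterative models based on it. Simulation experiments and the CR index confirm the advantages of these models. The algorithms have important practical significance for mining massive, accumulated volumes of data.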

Key words (Chinese version): fuzzy clustering, interval data, symbolic data analysis, adaptive
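
The abstract does not state the distance formula, so the following is only a minimal sketch of how a Wasserstein distance between intervals is commonly computed. It assumes values are uniformly distributed within each interval, in which case the squared L2 Wasserstein distance reduces to the squared difference of midpoints plus one third of the squared difference of half-widths. The function names `wasserstein2_sq` and `wasserstein2_sq_multi` are hypothetical, and the paper's adaptive single-index and double-index weighting is not reproduced here.

```python
def wasserstein2_sq(a, b):
    """Squared L2 Wasserstein distance between two intervals.

    Each interval is a (lower, upper) pair. Assumption (not taken from
    the paper): values are uniformly distributed inside each interval.
    """
    c1, r1 = (a[0] + a[1]) / 2.0, (a[1] - a[0]) / 2.0  # midpoint, half-width
    c2, r2 = (b[0] + b[1]) / 2.0, (b[1] - b[0]) / 2.0
    return (c1 - c2) ** 2 + (r1 - r2) ** 2 / 3.0


def wasserstein2_sq_multi(x, y):
    """Sum of per-variable squared distances for two interval-valued
    observations, each given as a list of (lower, upper) pairs.
    Unweighted: no adaptive weights are applied."""
    if len(x) != len(y):
        raise ValueError("observations must have the same number of variables")
    return sum(wasserstein2_sq(a, b) for a, b in zip(x, y))


if __name__ == "__main__":
    # Same width, midpoints shifted by 1: only the midpoint term contributes.
    print(wasserstein2_sq((0.0, 2.0), (1.0, 3.0)))
    # Different widths: the half-width term also contributes.
    print(wasserstein2_sq((0.0, 2.0), (0.0, 4.0)))
```

Unlike a distance built only from interval endpoints, this form separates location (midpoint) from spread (half-width), which is one way a distance can reflect how values are distributed within an interval.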