Computer Engineering and Applications ›› 2016, Vol. 52 ›› Issue (19): 12-18.

• Hot Topics and Reviews •

Two-stage time series clustering method based on cross-correlation

高启航1,2,杨卫东1,2   

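  1. School of Computer Science and Technology, Fudan University, Shanghai 201203
  2. Shanghai Key Laboratory of Data Science (Fudan University), Shanghai 201203
  • Publication date: 2016-10-01 Release date: 2016-11-18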

Two-step clustering method of time series clustering based on cross-correlation

GAO Qihang1,2, YANG Weidong1,2   

  1. School of Computer Science and Technology, Fudan University, Shanghai 201203, China
  2. Shanghai Key Laboratory of Data Science, Fudan University, Shanghai 201203, China
  • Online:2016-10-01 Published:2016-11-18

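Abstract (from the Chinese): An efficient time series clustering method is proposed. Built on the cross-correlation function, it uses a two-stage procedure to cluster time series at lower time complexity. The first stage symbolizes the time series and applies a purpose-designed feature extraction algorithm to the symbolized sequences to extract characteristic time periods. The second stage uses a modified cross-correlation procedure to cluster the time series more quickly. Experimental results show that the method handles extraction from both sparse and dense time series data. Compared with traditional clustering distance measures, it runs faster, represents scaling of the time series shape better, and maintains high accuracy.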

Keywords (from the Chinese): time series clustering, characteristic time period extraction, cross-correlation function
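
The abstract does not say which symbolization scheme the first stage uses, so the following is only a minimal sketch under an assumption: a SAX-style scheme (z-normalization, piecewise aggregate approximation, then equiprobable Gaussian breakpoints) with a fixed four-letter alphabet. The name `symbolize` and its parameters are hypothetical, and the paper's characteristic time period extraction algorithm is not reproduced here.

```python
import numpy as np

# Assumption: SAX-style breakpoints that split a standard normal
# distribution into 4 equiprobable regions (alphabet size 4).
BREAKPOINTS = np.array([-0.6745, 0.0, 0.6745])
ALPHABET = "abcd"

def symbolize(series, n_segments):
    """Turn a numeric series into a short string of symbols (hypothetical helper)."""
    x = np.asarray(series, dtype=float)
    if not 1 <= n_segments <= len(x):
        raise ValueError("n_segments must be between 1 and len(series)")
    # z-normalize so that the symbols describe shape rather than level or scale.
    std = x.std()
    z = (x - x.mean()) / std if std > 0 else np.zeros_like(x)
    # Piecewise aggregate approximation: mean of each near-equal segment.
    paa = np.array([seg.mean() for seg in np.array_split(z, n_segments)])
    # Each segment mean maps to the index of the region it falls in.
    idx = np.searchsorted(BREAKPOINTS, paa)
    return "".join(ALPHABET[i] for i in idx)
```

A run of identical symbols in the output marks a stretch where the series stays at roughly the same level, which is the kind of coarse view a later extraction step could scan for distinctive periods.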

Abstract: An efficient method for time series clustering based on cross-correlation is proposed, in which clustering is carried out in two steps. The first step symbolizes the time series and extracts characteristic time periods with a purpose-designed feature extraction algorithm. The second step is based on cross-correlation and achieves faster clustering by modifying the cross-correlation step. Experiments show that the method is suitable for extraction from both sparse and dense time series data. Compared with traditional clustering distance measures, it processes data faster and better captures stretching of the time series shape, while maintaining high accuracy.

Key words: time series clustering, characteristic period extraction, cross-correlation
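
To make the second step more concrete, the sketch below shows one common way to turn cross-correlation into a clustering distance: z-normalize both series, compute the cross-correlation at every lag, normalize it by the product of the vector norms, and take one minus the maximum. This is an assumption for illustration only. The abstract does not describe the authors' modification of the cross-correlation step, and the function names here are hypothetical.

```python
import numpy as np

def znorm(series):
    """z-normalize a series; a constant series becomes all zeros."""
    x = np.asarray(series, dtype=float)
    std = x.std()
    return (x - x.mean()) / std if std > 0 else np.zeros_like(x)

def cross_correlation_distance(a, b):
    """1 minus the peak normalized cross-correlation over all lags (range 0 to 2)."""
    x, y = znorm(a), znorm(b)
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    if denom == 0:
        # Assumption: a constant series carries no shape information,
        # so it is treated as uncorrelated with everything.
        return 1.0
    # mode="full" evaluates every possible lag between the two series.
    cc = np.correlate(x, y, mode="full") / denom
    return 1.0 - float(cc.max())
```

Because both inputs are z-normalized, the distance ignores offset and amplitude scaling, and taking the maximum over lags makes it tolerant of time shifts. The resulting pairwise distances could be passed to any clustering routine that accepts a precomputed distance matrix.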