Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (10): 138-141.

• Database, Signal and Information Processing •

Effective similarity definition and clustering through time series evolution analysis

ZHOU Yuan-bing, ZUO Xin-qiang, GU Jie, ZHAO Chun-hui

  1. State Power Economic Research Institute, Beijing 100761, China
  • Received: 2007-07-23 Revised: 2007-10-24 Online: 2008-04-01 Published: 2008-04-01
  • Contact: ZHOU Yuan-bing

Chinese title: Effective similarity definition and clustering based on time series evolution analysis

周原冰,左新强,顾 杰,赵春晖   

  1. 国网北京经济技术研究院,北京 100761
  • Corresponding author: 周原冰

Abstract: Time series are among the most widely used types of data in business applications, e.g. power load sequences and web logs. Mining time series is important for supporting decision-making. In particular, determining the similarity of time series plays a key part in various problems, e.g. analyzing the features of electricity demand in each district. Previous methods, in the context of managing and mining data, make little or no use of the evolutionary character of time series when measuring similarity. This paper proposes a previously unexplored approach based on evolution analysis of time series: it quantifies the evolution trend to construct an effective similarity definition, termed Similarity with Evolution Analysis (SEA). A clustering strategy based on SEA is also provided. Experimental comparisons with other methods on real data sets demonstrate the effectiveness of the proposed method, and thus indicate the importance of evolution analysis for measuring the similarity of time series.

Key words: time series, similarity definition, evolution analysis, clustering

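Abstract (from the Chinese): Time series are widespread in business applications, such as power load sequences and web logs. Mining time series data is very important for decision analysis; in particular, determining the similarity of time series plays a key role in various practical problems, such as analyzing the electricity demand characteristics of each region. Previous similarity measures have never used the evolution property to measure the similarity of time series. Based on evolution analysis, this paper proposes an effective time series similarity measure (SEA), which constructs an effective similarity definition by quantifying the evolution trend, and also proposes a clustering strategy based on this measure. Experimental comparison with other methods on real data sets demonstrates the effectiveness of the proposed method, and therefore also the significance of time series evolution analysis for similarity measurement.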

Key words (from the Chinese): time series, similarity definition, evolution analysis, clustering
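
The abstract does not give the SEA formula, so the following is only a minimal sketch of the general idea of a trend-based similarity and a clustering step built on it. It is not the authors' definition. It assumes that the "evolution trend" can be approximated by the direction of each step (rise, fall, or flat) and that similarity is the fraction of steps on which two equal-length series agree. The function names `trend_symbols`, `trend_similarity`, and `threshold_clusters`, the tolerance `eps`, and the threshold value are all hypothetical.

```python
from typing import List, Sequence


def trend_symbols(series: Sequence[float], eps: float = 0.0) -> List[int]:
    """Map each consecutive step to +1 (rise), -1 (fall), or 0 (flat within eps).

    Assumption: step direction stands in for the paper's 'evolution trend'.
    """
    symbols = []
    for prev, curr in zip(series, series[1:]):
        delta = curr - prev
        if delta > eps:
            symbols.append(1)
        elif delta < -eps:
            symbols.append(-1)
        else:
            symbols.append(0)
    return symbols


def trend_similarity(a: Sequence[float], b: Sequence[float], eps: float = 0.0) -> float:
    """Fraction of steps on which two equal-length series move in the same direction."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    if len(a) < 2:
        raise ValueError("series must contain at least two points")
    sa, sb = trend_symbols(a, eps), trend_symbols(b, eps)
    matches = sum(1 for x, y in zip(sa, sb) if x == y)
    return matches / len(sa)


def threshold_clusters(
    series_list: Sequence[Sequence[float]], threshold: float = 0.8, eps: float = 0.0
) -> List[List[int]]:
    """Greedy clustering (an illustrative choice, not the paper's strategy).

    Each series joins the first cluster whose representative (its first member)
    it matches with similarity >= threshold; otherwise it starts a new cluster.
    Returns clusters as lists of indices into series_list.
    """
    clusters: List[List[int]] = []
    for i, s in enumerate(series_list):
        for cluster in clusters:
            if trend_similarity(series_list[cluster[0]], s, eps) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Made-up load curves for illustration only (not data from the paper).
loads = [
    [1, 2, 3, 2, 1],       # rises then falls
    [10, 20, 30, 25, 5],   # same shape at a different scale
    [5, 4, 3, 4, 5],       # falls then rises
]
clusters = threshold_clusters(loads)
```

Because only step directions are compared, the first two curves are grouped together despite their different magnitudes, which a plain Euclidean distance would not do. A real measure would probably also need to weigh the size of each change and handle series of unequal length.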