Computer Engineering and Applications ›› 2007, Vol. 43 ›› Issue (5): 182-184.

• Database and Information Processing •

Efficient approach to analyse correlations of distributed data streams

  

  • Received:2006-03-16 Revised:1900-01-01 Online:2007-02-11 Published:2007-02-11

一种分布式数据流相关性分析的有效方法

程国达 杨小宁 谢岳   

  1. 南京财经大学 (Nanjing University of Finance and Economics)
  • Corresponding author: 程国达

Abstract: In distributed data streams, correlation analysis between streams can reveal intrinsic relationships among the monitored objects. This paper presents a basic-window-based algorithm for computing correlation coefficients over distributed data streams. The formula for the correlation coefficient is first rewritten as a combination of factors that are convenient to aggregate over basic windows, and each factor is then aggregated separately. In the basic-window approach, the data items in the sliding window are partitioned into a series of basic windows, each of which is aggregated on its own. After the window slides by an arbitrary amount, the partial results from the previous window can be reused to compute the aggregate for the current window. Simulation experiments show that, compared with re-aggregating all data items in the window each time, the proposed method reduces the time needed to compute correlation coefficients over data streams.

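Abstract (translated from the Chinese): In distributed data streams, correlation analysis between streams can reveal the intrinsic relationships among monitored objects. A basic-window-based method for computing the correlation coefficient is proposed. The method first transforms the correlation-coefficient formula into a form composed of factors suited to basic-window aggregation, and then aggregates each factor with the basic-window method. Basic-window aggregation partitions the data items in the window into a series of basic windows and computes over each basic window separately. After the window slides randomly, aggregation over the items in the new window can partially reuse the results from the previous window. Simulation experiments show that, compared with aggregating all data in the window every time, the basic-window method effectively reduces the time needed to compute correlation coefficients over data streams.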
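
The basic-window idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact factorization, so the sketch assumes the standard decomposition of the Pearson correlation coefficient into five running sums (Σx, Σy, Σx², Σy², Σxy). The class name `BasicWindowCorrelation` and its parameters are hypothetical, and for simplicity the window here slides one whole basic window at a time rather than by an arbitrary amount.

```python
import math
from collections import deque


class BasicWindowCorrelation:
    """Pearson correlation of two streams over a window of basic windows.

    Illustrative sketch only: names and the five-sum factorization are
    assumptions, not taken from the paper.
    """

    def __init__(self, basic_size, num_basic):
        self.basic_size = basic_size
        # One digest (five sums) per completed basic window; the oldest
        # digest is dropped automatically once num_basic are stored.
        self.digests = deque(maxlen=num_basic)
        self.current = [0.0] * 5  # sums for the basic window being filled
        self.count = 0            # items seen in the current basic window

    def add(self, x, y):
        """Consume one synchronized pair of items from the two streams."""
        terms = (x, y, x * x, y * y, x * y)
        self.current = [s + t for s, t in zip(self.current, terms)]
        self.count += 1
        if self.count == self.basic_size:
            # Basic window is full: store its digest and start a new one.
            self.digests.append(self.current)
            self.current = [0.0] * 5
            self.count = 0

    def correlation(self):
        """Combine the stored digests; raw items are never revisited."""
        n = len(self.digests) * self.basic_size
        if n == 0:
            return None
        sx, sy, sxx, syy, sxy = (sum(col) for col in zip(*self.digests))
        var_x = n * sxx - sx * sx
        var_y = n * syy - sy * sy
        if var_x <= 0 or var_y <= 0:
            return None  # correlation is undefined for a constant stream
        return (n * sxy - sx * sy) / math.sqrt(var_x * var_y)


# Usage: a window of 2 basic windows, each holding 2 items.
bw = BasicWindowCorrelation(basic_size=2, num_basic=2)
for x, y in zip([1, 2, 3, 4, 5, 6, 7, 8], [2, 1, 4, 3, 6, 5, 8, 7]):
    bw.add(x, y)
print(bw.correlation())
```

With this layout, each query costs time proportional to the number of basic windows instead of the number of items in the window, which is the kind of saving the abstract reports. Supporting slides that are not a multiple of the basic-window size would need extra bookkeeping that is omitted here.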