Computer Engineering and Applications ›› 2013, Vol. 49 ›› Issue (11): 140-144.


New method for online compression and storage of data streams

FENG Xiulan, ZHANG Fan   

  1. School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
  • Online: 2013-06-01 Published: 2013-06-14

一种新的数据流在线压缩存储方法

冯秀兰,张  帆   

  1. 北京林业大学 信息学院,北京 100083

Abstract: The sampling-based storage methods currently used for data streams ignore the analysis, processing, and storage management of historical stream data. To address this problem, this paper presents a new processing method based on curve fitting. The weighted least-squares principle is used to fit the cached stream data, which yields a better model description. The fitting results are analyzed by a clustering algorithm, which serves as a classifier for the polynomial fitting parameters. Based on the clustering result, an appropriate window size is chosen to fit the periodic stream data. The forecast is then compared with the actual data, and different storage methods are adopted depending on the comparison result. The experimental results indicate that the proposed method performs well and can meet different processing requirements.

Key words: curve fitting, data stream, clustering algorithm, least-squares principle
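
The fitting and storage steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the weighting scheme, the polynomial degree, the error tolerance, or the concrete storage methods, so the exponential-decay weights, the `degree`, `decay`, and `tolerance` parameters, and the "model" versus "raw" storage choice below are all assumptions made for illustration. The function names `fit_window` and `choose_storage` are hypothetical.

```python
import numpy as np


def fit_window(values, degree=2, decay=0.9):
    """Weighted least-squares polynomial fit of one cached window.

    Assumption: newer samples get larger weights through exponential
    decay; the abstract does not say how the weights are chosen.
    """
    values = np.asarray(values, dtype=float)
    t = np.arange(len(values), dtype=float)
    # Weight applied to the squared residual of each sample (newest = 1).
    sq_weights = decay ** (len(values) - 1 - t)
    # np.polyfit multiplies each residual by w before squaring,
    # so the square root of the intended weight is passed.
    return np.polyfit(t, values, degree, w=np.sqrt(sq_weights))


def choose_storage(coeffs, start, actual, tolerance=0.5):
    """Compare the forecast with the actual data and pick a storage form.

    Assumption: if every forecast value is within `tolerance` of the
    actual value, the existing model is enough ("model"); otherwise the
    raw samples are kept ("raw").
    """
    actual = np.asarray(actual, dtype=float)
    t = np.arange(start, start + len(actual), dtype=float)
    forecast = np.polyval(coeffs, t)
    if np.max(np.abs(forecast - actual)) <= tolerance:
        return "model", None
    return "raw", actual.tolist()


# Usage: fit a cached window, then decide how to store the next samples.
window = [2 * t + 1 for t in range(10)]
coeffs = fit_window(window, degree=1)
kind, payload = choose_storage(coeffs, start=10, actual=[21, 23, 25])
```

Storing only the coefficients when the forecast holds is what would give the compression in such a scheme; how the paper trades precision against storage is not stated in the abstract.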

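Abstract (translated from the Chinese): To address the problem that the sampling-based storage methods currently used for data streams neglect the analysis, processing, and storage management of historical stream data, a new method for storing data streams is proposed. Subject to the required data precision, the weighted least-squares method is used to fit piecewise curves to the cached data stream, and the fitting results are analyzed by clustering. Based on the clustering results, a suitable window is used for piecewise curve fitting of the data, and the fitting results are used to predict the trend of the data stream. The predictions are compared with the actual data, and different storage methods are applied depending on the comparison. Experimental results show that the proposed method performs well and can meet different processing requirements.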

Key words (translated from the Chinese): curve fitting, data stream, clustering algorithm, least-squares method
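
The clustering step, in which the fitted polynomial parameters of each window are grouped, might look like the following sketch. The abstract does not name the clustering algorithm or explain how the window size is derived from the clusters, so plain k-means with a simple deterministic initialisation is used here purely as an assumed stand-in, and the function name `cluster_coefficients` is hypothetical.

```python
import numpy as np


def cluster_coefficients(coeff_vectors, k, iterations=20):
    """Group per-window polynomial coefficient vectors with k-means.

    Assumption: k-means stands in for the unspecified clustering
    algorithm. Initial centers are taken at evenly spaced positions in
    the input, a simplification that keeps the sketch deterministic.
    """
    points = np.asarray(coeff_vectors, dtype=float)
    start_idx = np.linspace(0, len(points) - 1, k).astype(int)
    centers = points[start_idx].copy()
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iterations):
        # Distance from every point to every center, shape (n, k).
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels, centers


# Usage: coefficient vectors (slope, intercept) from six fitted windows.
fits = [[2.0, 1.0], [2.1, 0.9], [1.9, 1.1],
        [-3.0, 5.0], [-3.1, 5.2], [-2.9, 4.9]]
labels, centers = cluster_coefficients(fits, k=2)
```

Windows that share a label have similar fitted shapes; any rule for turning the labels into a window size would be a further assumption and is left out.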