Computer Engineering and Applications ›› 2018, Vol. 54 ›› Issue (18): 250-255. DOI: 10.3778/j.issn.1002-8331.1706-0133

A time series segmentation algorithm based on important points

SUN Zhiwei, DONG Liangliang, MA Yongjun   

  1. College of Computer Science and Information Engineering, Tianjin University of Science and Technology, Tianjin 300222, China
  • Online: 2018-09-15 Published: 2018-10-16

一种基于重要点的时间序列分段算法

孙志伟,董亮亮,马永军   

  1. 天津科技大学 计算机科学与信息工程学院,天津 300222

Abstract: Linear segmentation algorithms for time series that are based on important points can achieve good fitting accuracy while largely preserving the global characteristics of the series. However, traditional important-point segmentation algorithms require the user to specify parameters such as an error threshold. These parameters depend on the original data, which makes them inconvenient for the user to set, and both the efficiency and the fitting quality leave room for improvement. To address this problem, this paper proposes PLR_TSIP, a segmentation algorithm based on the important points of a time series. The method first jointly considers the overall fitting error and the sequence length, and then pre-segments the higher-priority segments in order to find the best segmentation. Finally, during segmentation it takes into account whether the maximum point and the minimum point of a segment lie in the same or in opposite directions, so that several important points can be split off in a single pass. Experiments on multiple data sets show that, compared with traditional segmentation algorithms, PLR_TSIP reduces the fitting error and achieves a better fit; compared with other important-point segmentation algorithms, it improves the fit while also substantially improving segmentation efficiency.

Key words: time series, important point, piecewise linear representation, fitting error
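
As background for the terms used in the abstract, the following is a minimal sketch of a generic top-down piecewise linear representation driven by important points. It is not the PLR_TSIP algorithm, whose priority rule, pre-segmentation step, and maximum/minimum direction test are not specified on this page. The function names (`segment_error`, `important_point_plr`, `total_fitting_error`), the use of the sum of squared vertical distances as the fitting error, and the choice to stop at a target number of segments rather than an error threshold are all assumptions made for illustration.

```python
import heapq


def segment_error(y, i, j):
    """Fit the straight line through (i, y[i]) and (j, y[j]).

    Returns (sum of squared vertical errors, index of the farthest point).
    The index is None when every interior point lies exactly on the line.
    """
    total, worst_idx, worst_dist = 0.0, None, 0.0
    for k in range(i + 1, j):
        fitted = y[i] + (y[j] - y[i]) * (k - i) / (j - i)
        dist = abs(y[k] - fitted)
        total += dist * dist
        if dist > worst_dist:
            worst_idx, worst_dist = k, dist
    return total, worst_idx


def important_point_plr(y, n_segments):
    """Generic top-down PLR (illustrative, not PLR_TSIP).

    Repeatedly splits the segment with the largest fitting error at its
    farthest point until n_segments segments exist or the fit is exact.
    Returns the sorted breakpoint indices, including both endpoints.
    """
    if len(y) < 2:
        raise ValueError("need at least two points")
    last = len(y) - 1
    err, idx = segment_error(y, 0, last)
    heap = [(-err, 0, last, idx)]  # max-heap on error via negated key
    breakpoints = {0, last}
    while len(breakpoints) - 1 < n_segments and heap:
        _, i, j, k = heapq.heappop(heap)
        if k is None:  # worst remaining segment already fits exactly
            break
        breakpoints.add(k)
        for a, b in ((i, k), (k, j)):
            e, m = segment_error(y, a, b)
            heapq.heappush(heap, (-e, a, b, m))
    return sorted(breakpoints)


def total_fitting_error(y, breakpoints):
    """Sum of the per-segment squared errors for a given segmentation."""
    return sum(segment_error(y, a, b)[0]
               for a, b in zip(breakpoints, breakpoints[1:]))
```

For example, `important_point_plr([0, 1, 2, 3, 2, 1, 0], 2)` splits the series at its peak, and `total_fitting_error` can then be used to compare segmentations of the same series by their fitting error.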

摘要: 基于重要点的时间序列线性分段算法能在较好地保留时间序列的全局特征的基础上达到较好的拟合精度。但传统的基于重要点的时间序列分段算法需要指定误差阈值等参数进行分段,这些参数与原始数据相关,用户不方便设定,而且效率和拟合效果有待于进一步提高。为了解决这一问题,提出一种基于时间序列重要点的分段算法——PLR_TSIP,该方法首先综合考虑到了整体拟合误差的大小和序列长度,接着针对优先级较高的分段进行预分段处理以期找到最优的分段;最后在分段时考虑到了分段中最大值点和最小值点的同异向关系,可以一次进行多个重要点的划分。通过多个数据集的实验分析对比,与传统的分段算法相比,减小了拟合误差,取得了更好的拟合效果;与其他重要点分段算法相比,在提高拟合效果的同时,较大地提高了分段效率。

关键词: 时间序列, 重要点, 分段线性表示, 拟合误差