计算机工程与应用 ›› 2021, Vol. 57 ›› Issue (3): 112-119.DOI: 10.3778/j.issn.1002-8331.2007-0118

• 大数据与云计算 • 上一篇    下一篇

基于LSH的shapelets转换方法

丁智慧,乔钢柱,程谭,宿荣   

  1. 中北大学 大数据学院,太原 030051
  • 出版日期:2021-02-01 发布日期:2021-01-29

Shapelets Transform Method Based on LSH

DING Zhihui, QIAO Gangzhu, CHENG Tan, SU Rong   

  1. School of Data Science and Technology, North University of China, Taiyuan 030051, China
  • Online:2021-02-01 Published:2021-01-29

摘要:

针对基于shapelets转换的时间序列分类算法因shapelets候选集中存在大量相似序列而造成耗时过长的问题,提出了一种基于LSH的shapelets转换方法(Locality Sensitive Hashing Shapelets Transform,LSHST),提出一种局部敏感哈希函数(LSH)的改进算法,对原始子序列候选集进行逐级过滤筛选,快速挑选出形态上具有代表性的shapelets集合,计算集合中shapelets的质量,采用覆盖的方法确定将要进行转换的shapelets,进一步减小shapelets的数量,进行shapelets转换。实验表明,与Shapelet Transform(ST)、ClusterShapelets(CST)和Fast Shapelet Selection(FSS)算法相比,LSHST在分类精度上最高提升了20.05、19.9和16.52个百分点,在时间节省程度上最高达8 000倍、16 000倍和8.5倍。

关键词: 时间序列分类, shapelets转换, 局部敏感哈希

Abstract:

Aiming at the problem that the time series classification algorithm based on shapelets conversion is too time-consuming due to the existence of a large number of similar sequences in the shapelets candidate set, a LSH-based Shapelets Transformation method(Locality Sensitive Hashing Shapelets Transform, LSHST) is proposed. An improved algorithm of Local Sensitive Hash(LSH) function is proposed, and the original subsequence candidate set is filtered step by step. The quality of shapelets in the set is calculated, and the amount of shapelets is reduced by a method of coverage to determine the shapelets to be transformed, and finally shapelets is transformed. The experiments show that, compared with algorithms such as Shapelet Transform(ST), ClusterShapelet(CST), and Fast Shapelet Selection(FSS), LSHST improves the classification accuracy by 20.05, 19.95, and 16.52 percentage points, and the highest time savings 8,000 times, 16,000 times and 8.5 times.

Key words: time series classification, shapelets transform, Locality Sensitive Hashing(LSH)