Computer Engineering and Applications ›› 2013, Vol. 49 ›› Issue (2): 152-154.

Previous Articles     Next Articles

Efficient frequent pattern tree structure over data streams

TAN Jun1,2, BU Yingyong1, CHEN Aibin2   

  1. 1.College of Mechanical & Electrical Engineering, Central South University, Changsha 410083, China
    2.College of Computer Science and Information Technology, Central South University of Forestry and Technology University, Changsha 412006, China
  • Online:2013-01-15 Published:2013-01-16

数据流上一种单遍扫描频繁模式树结构

谭  军1,2,卜英勇1,陈爱斌2   

  1. 1.中南大学 机电工程学院,长沙 410083
    2.中南林业科技大学 计算机与信息工程学院,长沙 412006

Abstract: Aiming at the problem that FP-growth algorithm can not adapt to the data stream with the characteristics of infinity and fluidity, this paper presents a novel variation structure of FP-tree called FPS-tree, which captures all database information in current window with one scan. For effectively deleting expired panes in sliding window, a novel concept-“tail-node” is proposed, so the information on pane in every path of FPS-tree is only retained in the tail-node. Experimental results show that compact performance of the FPS-tree is better than other prefix-tree structure with one scan.

Key words: data stream, Frequent Pattern(FP)-growth algorithm, single-pass pattern tree, tail-node

摘要: 针对频繁模式增长算法无法适应数据流的无限性和流动性的特点,提出一种新颖的FP-tree的变形结构——FPS-tree,只需单遍扫描便能获取当前窗口的全部数据库信息。为了在滑动窗口时有效地删除过期窗格和插入新窗格,提出一个新颖的概念——“尾结点”,FPS-tree中每条路径上的窗格信息只保持在尾结点里。实验结果表明FPS-tree的压缩性能要优于其他单遍扫描的前缀树结构。

关键词: 数据流, 频繁模式增长算法, 单遍扫描模式树, 尾结点