Computer Engineering and Applications, 2008, Vol. 44, Issue (34): 142-144. DOI: 10.3778/j.issn.1002-8331.2008.34.044

• Database, Signal and Information Processing •

Fast algorithm for mining frequent itemsets over data streams

XU Jian-min (1, 2), HAO Li-wei (1), WANG Yu (1)

  1. College of Mathematics and Computer Science, Hebei University, Baoding, Hebei 071002, China
  2. Institute of Systems Engineering, Tianjin University, Tianjin 300072, China
  • Received: 2008-06-03 Revised: 2008-09-04 Online: 2008-12-01 Published: 2008-12-01
  • Contact: XU Jian-min

Abstract: In recent years, data stream mining has become an active research topic both in China and internationally, and mining frequent itemsets is one of its central problems. Because data streams are unbounded and continuously flowing, an algorithm called FIM-SW is proposed to mine frequent itemsets over a sliding window. The algorithm adopts a vertical database representation in which each item is represented by a bit vector, and it uses the Apriori property to generate frequent itemsets. Experimental results show that the algorithm significantly improves mining efficiency.

Key words: data mining, data stream, frequent itemset, sliding window
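
The abstract names three ingredients: a vertical representation with one bit vector per item, a sliding window, and the Apriori property. The following Python sketch illustrates how these pieces can fit together in general. It is not the authors' FIM-SW implementation, whose details are not given here. The function names (`build_bitvectors`, `slide`, `frequent_itemsets`), the use of Python integers as bit vectors, and the absolute threshold `min_count` are all assumptions made for illustration.

```python
from itertools import combinations


def popcount(bits):
    """Number of set bits, i.e. the support count of a bit vector."""
    return bin(bits).count("1")


def build_bitvectors(window):
    """Vertical representation: bit i of vectors[item] is set when
    transaction i of the window contains that item."""
    vectors = {}
    for position, transaction in enumerate(window):
        for item in transaction:
            vectors[item] = vectors.get(item, 0) | (1 << position)
    return vectors


def slide(vectors, window_size, new_transaction):
    """Advance a full window by one transaction (assumed policy):
    shift out the oldest slot (bit 0), then record the new
    transaction in the newest slot (bit window_size - 1)."""
    for item in list(vectors):
        vectors[item] >>= 1
        if vectors[item] == 0:
            del vectors[item]  # item no longer occurs in the window
    newest = 1 << (window_size - 1)
    for item in new_transaction:
        vectors[item] = vectors.get(item, 0) | newest


def frequent_itemsets(vectors, min_count):
    """Level-wise search. The support of an itemset is the popcount of
    the bitwise AND of its items' vectors; a candidate is pruned when
    any of its subsets one item smaller is not frequent (Apriori)."""
    current = {
        frozenset([item]): bits
        for item, bits in vectors.items()
        if popcount(bits) >= min_count
    }
    result = {}
    while current:
        for itemset, bits in current.items():
            result[itemset] = popcount(bits)
        next_level = {}
        for (a, bits_a), (b, bits_b) in combinations(current.items(), 2):
            candidate = a | b
            if len(candidate) != len(a) + 1 or candidate in next_level:
                continue
            if any(candidate - {x} not in current for x in candidate):
                continue  # Apriori pruning
            bits = bits_a & bits_b
            if popcount(bits) >= min_count:
                next_level[candidate] = bits
        current = next_level
    return result
```

For a window of four transactions `{a,b,c}`, `{a,b}`, `{a,c}`, `{b,c}` and `min_count = 2`, every single item and every pair is frequent, while `{a,b,c}` occurs only once and is rejected. Shifting the integers in `slide` updates the window without rescanning the stored transactions, which is one plausible reason a bit-vector layout suits sliding windows.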
