Computer Engineering and Applications ›› 2010, Vol. 46 ›› Issue (16): 132-134. DOI: 10.3778/j.issn.1002-8331.2010.16.039

• Database, Signal and Information Processing •

Algorithm for mining frequent itemsets from sliding window over data streams

ZHANG Yue-qin

  1. College of Electronic and Information Engineering, Nanjing University of Technology, Nanjing 210009, China
  • Received: 2009-10-09 Revised: 2009-12-25 Online: 2010-06-01 Published: 2010-06-01
  • Contact: ZHANG Yue-qin

Abstract: Motivated by the flowing, continuous nature of data streams, this paper proposes NSW, an algorithm for mining frequent itemsets from a sliding window over a data stream, to meet the need for quickly obtaining the frequent itemsets in the most recently arrived data. The algorithm represents the transaction list in the sliding window as a binary matrix. It controls time and space overhead by deleting the oldest transaction directly and by generating frequent itemsets directly, without generating candidate itemsets. Experimental results show that the algorithm has good time and space efficiency.

Key words: data mining, data stream, frequent itemsets, sliding window, matrix
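
The abstract does not give the details of NSW, so the following is only a minimal illustrative sketch of the general idea of keeping a sliding window as a binary matrix, not the paper's procedure. It assumes a column-per-item layout in which each item owns an integer bitmask and bit i marks whether transaction i of the window contains that item. The names `SlidingWindow`, `add` and `frequent_itemsets` are hypothetical. The mining step is a swapped-in technique: a depth-first intersection of bit columns in the style of Eclat, which tests one extension at a time and therefore does not reproduce the candidate-free generation the abstract attributes to NSW.

```python
class SlidingWindow:
    """Hypothetical sketch: a sliding window stored as a binary matrix.

    Each item maps to an int used as a bit column; bit i is set when
    transaction i of the current window contains that item.
    """

    def __init__(self, window_size):
        self.window_size = window_size  # maximum number of transactions kept
        self.count = 0                  # transactions currently in the window
        self.columns = {}               # item -> bit column (int)

    def add(self, transaction):
        """Append a transaction, dropping the oldest one if the window is full."""
        if self.count == self.window_size:
            # The oldest transaction sits at bit 0, so a right shift of
            # every column deletes it directly; empty columns are discarded.
            self.columns = {
                item: bits >> 1
                for item, bits in self.columns.items()
                if bits >> 1
            }
            self.count -= 1
        for item in set(transaction):
            self.columns[item] = self.columns.get(item, 0) | (1 << self.count)
        self.count += 1

    def frequent_itemsets(self, min_support):
        """Return {itemset tuple: support count} for the current window."""
        def popcount(bits):
            return bin(bits).count("1")

        items = sorted(
            item for item, bits in self.columns.items()
            if popcount(bits) >= min_support
        )
        result = {}

        def extend(prefix, prefix_bits, start):
            # Intersect the prefix column with each later item's column;
            # the number of set bits is the support of the longer itemset.
            for k in range(start, len(items)):
                bits = prefix_bits & self.columns[items[k]]
                support = popcount(bits)
                if support >= min_support:
                    itemset = prefix + (items[k],)
                    result[itemset] = support
                    extend(itemset, bits, k + 1)

        extend((), (1 << self.count) - 1, 0)
        return result
```

In this layout, deleting the oldest transaction costs one shift per item rather than a rescan of the window, and `min_support` is an absolute count of transactions, not a fraction.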

CLC number: