计算机工程与应用 ›› 2007, Vol. 43 ›› Issue (4): 176-178.

• 数据库与信息处理 • 上一篇    下一篇

一种基于FP-tree的频繁项集增量更新算法

廖仁全 王利华 邱江涛   

  1. 四川大学计算机学院 四川大学计算机学院 四川大学计算机学院
  • 收稿日期:2006-02-27 修回日期:1900-01-01 出版日期:2007-02-01 发布日期:2007-02-01
  • 通讯作者: 邱江涛

An Efficient Algorithm for Incremental Updating of Frequent Itemsets Based on FP-tree

  • Received:2006-02-27 Revised:1900-01-01 Online:2007-02-01 Published:2007-02-01

摘要: 本文针对频繁项集增量更新的问题,提出算法FIU。该算法将保存了数据库事务的FP-tree存储在磁盘上。当挖掘新支持度阈值的频繁项集时,只需从磁盘上读入FP-tree,再挖掘新支持度阈值下的频繁项集。当新增数据库事务记录后,首先建立新项目表,然后根据新项目表建立新增事务记录的FP-tree。读入存储在磁盘上的FP-tree,抽取出所有的事务记录,再插入到新FP-tree中,从而得到增量更新后的FP-tree。最后在增量更新后的FP-tree上挖掘频繁项集。实验证明,FIU算法执行时间不随数据库大小变化,与其他算法相比有较好的性能。

关键词: 数据挖掘, 关联规则, 频繁项集, 增量更新

Abstract: Incremental updating of frequent itemsets in a database includes three problems. An algorithm FIU is proposed for the three problems. Firstly, FP-tree which save transaction recorders of database is materialized on disk. Secondly, When frequent itemsets need be found with new support threshold, only work is to read FP-tree to memory from disk, then travel the tree to find itemsets under new support threshold. Thirdly, when new data insert into database, a new item table of database need be build, then a new FP-tree is built based on the new item table. All transaction records, which were saved in FP-tree on disk, are acquired, then insert into the new FP-tree. Finally, updated frequent itemsets may be found in the FP-tree.

Key words: data mining, association rules, frequent itemset, incremental updating