计算机工程与应用 ›› 2008, Vol. 44 ›› Issue (10): 150-153.

• 数据库、信号与信息处理 • 上一篇    下一篇

在单向FP-tree上挖掘频繁闭项集

王现君1,宋晶晶2,姜保庆1   

  1. 1.河南大学 数据与知识工程研究所,河南 开封 475004
    2.清远职业技术学院 信息科技学院,广东 清远 511510
  • 收稿日期:2007-07-13 修回日期:2007-10-12 出版日期:2008-04-01 发布日期:2008-04-01
  • 通讯作者: 王现君

Mining frequent closed itemsets in unidirectional FP-tree

WANG Xian-jun1,SONG Jing-jing2,JIANG Bao-qing1   

  1. 1.Institute of Data and Knowledge Engineering,Henan University,Kaifeng,Henan 475004,China
    2.School of Information & Science,Qingyuan Polytechnic,Qingyuan,Guangdong 511510,China
  • Received:2007-07-13 Revised:2007-10-12 Online:2008-04-01 Published:2008-04-01
  • Contact: WANG Xian-jun

摘要: 频繁闭项集提供了频繁项集的一种完整的、最小表示。针对稠密数据集,提出一种基于单向FP-tree的频繁闭项集挖掘算法Unid_FP-FCI。该算法在挖掘过程中只生成被约束子树,而它是一种虚拟的树结构,在原有的单向FP-tree基础上用三个很小的数组来表示,因而避免了以往算法需递归构造条件FP-tree来计算频繁闭项集的弊端,极大地降低了内存空间和时间开销,提高了挖掘效率。

关键词: 数据挖掘, 频繁项集, 频繁闭项集, 单向FP-tree, 被约束子树

Abstract: Frequent closed itemsets provide a minimal representation of frequent itemsets without losing their support information.This paper proposes an efficient algorithm Unid_FP-FCI for mining the complete set of frequent closed itemsets in a unidirectional FP-tree.Because in process of mining only generate constrained sub-trees consisting of three small arrays,which is pseudo tree structure based on the originally unidirectional FP-tree,the flaw is avoided in former algorithms which need to generate lots of conditional FP-trees for finding frequent closed itemsets recursively.Reducing the space and time consumption to a great extent,then the algorithm improve mining efficiency.

Key words: data mining, frequent itemset, frequent closed itemset, unidirectional FP-tree, constrained sub-tree