Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (36): 164-167.DOI: 10.3778/j.issn.1002-8331.2008.36.046

• 数据库、信号与信息处理 • Previous Articles     Next Articles

Method of frequent pattern mining based on mapping hash table

CHEN Yin,SHAN Si-qing   

  1. Department of Management Information System,Beihang University,Beijing 100083,China
  • Received:2007-01-02 Revised:2008-05-22 Online:2008-12-21 Published:2008-12-21
  • Contact: CHEN Yin

采用映射哈希表的频繁模式挖掘方法

陈 茵,闪四清   

  1. 北京航空航天大学 信息管理与信息系统系,北京 100083
  • 通讯作者: 陈 茵

Abstract: Most of the algorithm researches in frequent pattern mining focused on the amelioration of algorithms on logic process,but explorations on physical storage mode when algorithms were running on computers received much less attention.In this article,a new storage structure used in frequent pattern mining based on FP-Tree is proposed.It uses a sequence storage structure and a hash table which is based on mappings of frequent items’ IDs as its header table.A corresponding frequent pattern mining algorithm is proposed.Experiments show that the new storage structure is better than FP-Tree,and the corresponding algorithm is much faster than FP-Growth.

Key words: association rule, frequent pattern, FP-Tree, FP-Growth, hash table, mapping

摘要: 大多数对频繁模式挖掘算法的研究都着眼于逻辑层面算法过程的改进,而对数据在计算机内存中的物理存储方式的探索相对较少。以FP-Tree存储结构和FP-Growth算法为基础,提出了FP-Tree头表的顺序存储方式,并在此基础上,利用基于频繁项ID映射的哈希表对FP-Tree的存储方式进行了改进,提出了与之相对应的频繁模式挖掘算法。实验结果表明该算法是快速和有效的。

关键词: 关联规则, 频繁模式, 频繁模式树, 频繁模式增长, 哈希表, 映射