Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (31): 128-131.

• 数据库、信号与信息处理 • Previous Articles     Next Articles

Distributed constructing algorithm of frequent pattern tree based on grid

XUN Yaling,WU Xiaoting,ZHANG Jifu   

  1. School of Computer Science and Technology,Taiyuan University of Science and Technology,Taiyuan 030024,China
  • Received:1900-01-01 Revised:1900-01-01 Online:2011-11-01 Published:2011-11-01

一种网格环境下的FP-树分布式构造算法

荀亚玲,吴晓婷,张继福   

  1. 太原科技大学 计算机科学与技术学院,太原 030024

Abstract: For FP-tree constructing and merging based on distributed computing platform,a distributed algorithm of constructing FP-tree(GridDBMA) is presented based on the grid.At first,the global item head table is made,then the local frequent pattern tree(BFP-tree) is constructed independently according to the order of the item head table in each node.The merge-
algorithm is used to unite the local frequent pattern trees into a global tree,which can extract the global frequent item sets.Through improving the traditional storage structures of frequent pattern tree,the size of the tree and the communication between nodes are reduced,the traversal of tree is more convenient and effective,and the mining efficiency of frequent item sets is improved.The experiments show the validity and effectiveness of the algorithm by using star spectral data set.

Key words: grid, distributed data mining, frequent pattern, association rule, FP-tree

摘要: 针对分布式环境下FP-tree的构造及合并,给出了一种网格环境下FP-tree的分布式构造算法GridDBMA。该算法中,各站点根据全局项目头表,独立构造局部频繁模式树BFP-tree,然后,利用合并算法将各局部树合并为一棵全局频繁模式树,并在全局频繁模式树上提取出所求的频繁项目集,通过对传统频繁模式树的存储结构的改进,减少了树的规模及站点间的网络通信量,并使树的遍历更加方便有效,提高了合并效率,从而提高了整个频繁项目集的挖掘效率。最后,采用天体光谱数据作为形式背景,实验验证了该算法的正确性和有效性。

关键词: 网格, 分布式数据挖掘, 频繁模式, 关联规则, FP-树