Computer Engineering and Applications ›› 2010, Vol. 46 ›› Issue (24): 151-153. DOI: 10.3778/j.issn.1002-8331.2010.24.045

• Database, Signal and Information Processing •

Mining high confidence association rules

ZHOU Xian-shan1, DU You-fu1, SHAO Shi-huang2, YU Guang-zhu2

  1. College of Computer Science and Technology, Changjiang University, Jingzhou, Hubei 434000, China
  2. College of Information Science and Technology, Donghua University, Shanghai 201600, China
  • Received: 2010-03-30 Revised: 2010-06-03 Online: 2010-08-21 Published: 2010-08-21
  • Contact: ZHOU Xian-shan

Abstract: Both traditional association rule mining and utility-based association rule mining may overlook rules whose support or utility is low but whose confidence is very high. Such high-confidence rules can serve users whose main goal is to avoid risk or raise the success rate. To mine rules with low support (or utility) and high confidence, this paper proposes a new algorithm, HCARM. HCARM uses a partition method to handle large data sets and a new pruning strategy to reduce the search space. In addition, a length threshold, minlen, makes HCARM suitable for long-pattern mining. Experiments on synthetic data show that the method is effective for mining high-confidence long patterns.
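
The gap the abstract describes, a rule with low support but high confidence, can be illustrated with a small sketch. This is not the HCARM algorithm, whose partitioning and pruning steps are not given here. It only applies the standard definitions of support and confidence to a toy data set; the transactions, item names, and function names are all hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of transactions containing `antecedent` that also contain `consequent`."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    n_a = sum(a <= t for t in transactions)
    n_both = sum(both <= t for t in transactions)
    return n_both / n_a if n_a else 0.0


# Hypothetical toy data: ten market-basket transactions.
transactions = [frozenset(t) for t in (
    ["bread", "milk"],
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["bread"],
    ["milk"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "eggs"],
    ["caviar", "champagne"],
    ["bread", "milk"],
)]

# Assumed threshold for illustration; a support-first miner discards
# every itemset below it before confidence is ever computed.
MIN_SUPPORT = 0.3

for lhs, rhs in ((["bread"], ["milk"]), (["caviar"], ["champagne"])):
    s = support(lhs + rhs, transactions)
    c = confidence(lhs, rhs, transactions)
    kept = s >= MIN_SUPPORT
    print(f"{lhs} -> {rhs}: support={s:.2f} confidence={c:.2f} kept={kept}")
```

In this toy data the rule caviar -> champagne has support 0.1 and confidence 1.0, so a support threshold of 0.3 discards it even though it never fails. This is the kind of rule the abstract says conventional approaches overlook.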

CLC Number: