Computer Engineering and Applications ›› 2009, Vol. 45 ›› Issue (33): 125-127. DOI: 10.3778/j.issn.1002-8331.2009.33.041

• Database, Signal and Information Processing •

Study of high motivation itemsets mining

YU Guang-zhu (1,3), LIU Xu-hui (2), SHAO Shi-huang (1)

  1. College of Information Science and Technology, Donghua University, Shanghai 201600, China
  2. College of Mechanical Engineering, Yangtze University, Jingzhou, Hubei 434000, China
  3. Department of Computer Science, Hubei University of Police, Wuhan 430034, China
  • Received: 2008-07-02  Revised: 2008-08-03  Online: 2009-11-21  Published: 2009-11-21
  • Contact: YU Guang-zhu

Abstract: Support-based association rule mining can discover only frequent itemsets; it cannot find itemsets that are infrequent but have high utility. Utility-based association rule mining aims to discover all high utility itemsets, but it misses itemsets whose utility is not high yet whose product of support and utility is large. To overcome the shortcomings of both measures, a new measure of itemset importance, called motivation, is proposed, together with a bottom-up algorithm, HM-Two-Phase-Miner, for mining high motivation itemsets. Motivation combines the advantages of support and utility and can therefore express both the semantic and the statistical significance of an itemset. HM-Two-Phase-Miner prunes the search space using the transaction-weighted motivation downward closure property, which effectively improves the performance of the algorithm.
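The abstract does not give formal definitions, so the following is only an illustrative sketch under stated assumptions, not the authors' HM-Two-Phase-Miner. It assumes that the motivation of an itemset is the product of its support and its utility, and that the transaction-weighted motivation is the product of its support and its transaction-weighted utility. Under these assumptions the transaction-weighted motivation is an upper bound on motivation and never increases when an itemset grows, so it can be used for level-wise pruning. The toy data, the profit table and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 1},
    {"b": 3},
]
# Hypothetical unit profit of each item.
PROFIT = {"a": 5, "b": 1, "c": 10}


def item_utility(item, tx):
    """Utility of one item in one transaction: quantity times unit profit."""
    return tx[item] * PROFIT[item]


def support(itemset, db):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for tx in db if itemset.issubset(tx)) / len(db)


def utility(itemset, db):
    """Sum of the itemset's own item utilities over the transactions containing it."""
    return sum(
        sum(item_utility(i, tx) for i in itemset)
        for tx in db
        if itemset.issubset(tx)
    )


def motivation(itemset, db):
    """Assumed definition: support multiplied by utility."""
    return support(itemset, db) * utility(itemset, db)


def tw_motivation(itemset, db):
    """Assumed upper bound: support times the total utility of the
    transactions that contain the itemset."""
    twu = sum(
        sum(item_utility(i, tx) for i in tx)
        for tx in db
        if itemset.issubset(tx)
    )
    return support(itemset, db) * twu


def mine_high_motivation(db, min_motivation):
    """Two-phase, bottom-up search for itemsets with motivation >= threshold."""
    # Phase 1: level-wise candidate generation, pruned by the upper bound.
    items = sorted({i for tx in db for i in tx})
    level = [frozenset([i]) for i in items]
    candidates = []
    while level:
        kept = [x for x in level if tw_motivation(x, db) >= min_motivation]
        candidates.extend(kept)
        kept_set = set(kept)
        # Join k-itemsets that differ in exactly one item into (k+1)-itemsets.
        joined = {a | b for a, b in combinations(kept, 2) if len(a | b) == len(a) + 1}
        # Keep a joined itemset only if all of its k-subsets survived.
        level = [c for c in joined if all(c - {i} in kept_set for i in c)]
    # Phase 2: compute the exact motivation of the surviving candidates.
    result = {}
    for x in candidates:
        m = motivation(x, db)
        if m >= min_motivation:
            result[x] = m
    return result
```

With a threshold of 10 on this toy data, `mine_high_motivation(TRANSACTIONS, 10)` returns the itemsets {a}, {c} and {a, c}. The itemset {b, c} is discarded in the first phase because its upper bound is below the threshold, so {a, b, c} is never generated.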

Key words: high motivation itemset, association rule, support, utility-based
