Computer Engineering and Applications ›› 2007, Vol. 43 ›› Issue (25): 168-171.

• Database and Information Processing •

Associative classification based on interestingness of rules

WANG Xi-zhao, ZHAO Dong-lei

  1. Department of Mathematics and Computer Science, Hebei University, Baoding, Hebei 071002, China
  • Online: 2007-09-01 Published: 2007-09-01
  • Contact: WANG Xi-zhao

Abstract: Associative classification offers high classification accuracy and strong adaptability. However, because the classifier is built from a set of high-confidence rules, it can sometimes overfit. This paper proposes Associative Classification based on Interestingness of Rules (ACIR). ACIR extends the TD-FP-growth algorithm to mine the training set efficiently and generate interesting rules that satisfy the minimum support and minimum confidence thresholds. It then prunes the rules and selects a small rule set to build the classifier. During pruning, rule quality is evaluated by rule interestingness, which combines the predictive accuracy of a rule with the interestingness of the items in the rule. Experimental results show that ACIR achieves better classification accuracy than See5, CBA and CMAR, and that it is highly comprehensible and scalable.

Key words: data mining, associative classification, class association rules, interestingness of rules
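
For readers unfamiliar with class association rules, the minimum-support and minimum-confidence filter mentioned in the abstract can be illustrated with a minimal sketch. It uses brute-force enumeration on a hypothetical toy dataset rather than TD-FP-growth, and it does not implement the paper's rule interestingness measure, which the abstract does not define. The names `TRAIN` and `mine_cars`, the `max_len` parameter, and the threshold values are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy training set: each row is (set of items, class label).
TRAIN = [
    ({"a", "b"}, "yes"),
    ({"a", "c"}, "yes"),
    ({"a", "b", "c"}, "yes"),
    ({"b", "c"}, "no"),
    ({"c"}, "no"),
]


def mine_cars(rows, min_support, min_confidence, max_len=2):
    """Enumerate class association rules (body -> label) by brute force.

    support    = rows containing body with this label / all rows
    confidence = rows containing body with this label / rows containing body
    """
    n = len(rows)
    body_count = Counter()  # rows containing each candidate body
    rule_count = Counter()  # rows containing the body and carrying the label
    for items, label in rows:
        for k in range(1, max_len + 1):
            for body in combinations(sorted(items), k):
                body_count[body] += 1
                rule_count[(body, label)] += 1

    rules = []
    for (body, label), hits in rule_count.items():
        support = hits / n
        confidence = hits / body_count[body]
        if support >= min_support and confidence >= min_confidence:
            rules.append((body, label, support, confidence))

    # Rank by confidence, then support, then shorter bodies first.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0]), r[0]))
    return rules


RULES = mine_cars(TRAIN, min_support=0.4, min_confidence=0.8)
for body, label, support, confidence in RULES:
    print(body, "->", label, "support:", support, "confidence:", confidence)
```

On this toy data only rules predicting `yes` pass both thresholds. A pruning step such as the one the abstract describes would then rank or discard these surviving rules using a further quality measure.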