Computer Engineering and Applications ›› 2010, Vol. 46 ›› Issue (25): 217-219. DOI: 10.3778/j.issn.1002-8331.2010.25.063

• Engineering and Applications •


Multi-variable decision tree construction and research

CHEN Guang-hua, WANG Zheng-qun, LIU Feng, YU Zhen-zhou

  1. School of Information Engineering, Yangzhou University, Yangzhou, Jiangsu 225009, China
  • Received: 2009-02-20  Revised: 2009-04-14  Online: 2010-09-01  Published: 2010-09-01
  • Contact: CHEN Guang-hua


Abstract: Univariate decision tree algorithms tend to produce large trees with complex rules that are difficult to understand, whereas the multi-variable decision tree is an effective data mining method for classification. The key to its construction is selecting, according to the correlations among attributes, a suitable combination of attributes to form a new attribute that serves as a node. Combining the knowledge dependency measure from rough set theory with the dispersion degree of the condition attribute set in an information system, a multi-variable decision tree construction algorithm (RD) is proposed. Experimental results on several UCI datasets show that the proposed algorithm achieves better classification performance than the traditional ID3 algorithm and a multi-variable decision tree construction algorithm based on the relative core of attributes.

Key words: decision tree, rough set, attribute dependency degree, dispersion degree
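
For reference, the attribute dependency degree from rough set theory that the abstract refers to is the standard measure gamma_B(D) = |POS_B(D)| / |U|, i.e. the fraction of objects whose decision class is uniquely determined by the condition attribute subset B. The following is a minimal illustrative Python sketch of that standard measure only, not the authors' RD algorithm; the toy decision table and attribute names are invented, and the paper's dispersion-degree measure is not reproduced here because the abstract does not define it.

from collections import defaultdict

def partition(universe, rows, attrs):
    """Group object indices into equivalence classes induced by `attrs`."""
    blocks = defaultdict(list)
    for i in universe:
        key = tuple(rows[i][a] for a in attrs)
        blocks[key].append(i)
    return list(blocks.values())

def dependency_degree(rows, condition_attrs, decision_attr):
    """gamma_B(D): fraction of objects whose decision class is fully
    determined by the condition attribute subset B (the B-positive region)."""
    universe = range(len(rows))
    positive = 0
    for block in partition(universe, rows, condition_attrs):
        decisions = {rows[i][decision_attr] for i in block}
        if len(decisions) == 1:          # block lies entirely in one decision class
            positive += len(block)
    return positive / len(rows)

if __name__ == "__main__":
    # Tiny invented decision table, for illustration only.
    table = [
        {"outlook": "sunny", "wind": "weak",   "play": "no"},
        {"outlook": "sunny", "wind": "strong", "play": "no"},
        {"outlook": "rain",  "wind": "weak",   "play": "yes"},
        {"outlook": "rain",  "wind": "strong", "play": "no"},
    ]
    print(dependency_degree(table, ["outlook"], "play"))          # 0.5
    print(dependency_degree(table, ["outlook", "wind"], "play"))  # 1.0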
