Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (33): 18-21.

• Doctoral Forum •

Construction of decision tree with discernibility matrix

GAO Jing1, HAN Zhidong2

  1. School of Information, Capital University of Economics and Business, Beijing 100070, China
  2. Information Center of Beijing, China UnionPay Co., Ltd., Beijing 100193, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2011-11-21 Published: 2011-11-21

Abstract: The relationship among attribute selection criteria based on the positive region, the rough boundary, and attribute dependency is analyzed, and the three criteria are proved to be equivalent to one another. Taking the positive-region criterion as representative, the advantages and disadvantages of the decision tree construction algorithm based on the positive region are analyzed. To address these disadvantages, a new attribute selection criterion based on the size of discernibility elements is proposed. Decision trees constructed with the new criterion generally have fewer leaf nodes, a smaller average leaf depth, and leaves with stronger generalization ability. An example illustrates the advantages of the new criterion.

Key words: decision tree, rough set, positive region, rough boundary, attribute dependency, discernibility matrix
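
The quantities named in the abstract can be illustrated with a minimal Python sketch using the standard rough-set definitions. The toy decision table `ROWS`, the attribute names `a`, `b`, `d`, and all function names are assumptions made for illustration and are not taken from the paper. The sketch computes the positive region, the dependency degree, and the boundary for a candidate attribute, which makes it easy to see why the three criteria rank attributes identically: the dependency degree is the positive-region size divided by a constant, and the boundary is the complement of the positive region. It also builds a discernibility matrix, but stops there, because the abstract does not specify how the size of discernibility elements is turned into a selection rule.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy decision table: condition attributes "a", "b"; decision "d".
ROWS = [
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 1, "b": 0, "d": "no"},
    {"a": 1, "b": 1, "d": "no"},
    {"a": 2, "b": 1, "d": "yes"},
]


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def positive_region(rows, attrs, dec="d"):
    """Indices of rows whose equivalence class has a single decision value."""
    pos = set()
    for block in partition(rows, attrs):
        if len({rows[i][dec] for i in block}) == 1:
            pos.update(block)
    return pos


def dependency(rows, attrs, dec="d"):
    """Dependency degree: |POS| / |U|."""
    return len(positive_region(rows, attrs, dec)) / len(rows)


def boundary(rows, attrs, dec="d"):
    """Boundary taken over all decision classes: U minus the positive region."""
    return set(range(len(rows))) - positive_region(rows, attrs, dec)


def discernibility_matrix(rows, cond, dec="d"):
    """For each pair with different decisions, the attributes that tell them apart."""
    matrix = {}
    for i, j in combinations(range(len(rows)), 2):
        if rows[i][dec] != rows[j][dec]:
            matrix[(i, j)] = {a for a in cond if rows[i][a] != rows[j][a]}
    return matrix


if __name__ == "__main__":
    for attr in ("a", "b"):
        print(attr,
              sorted(positive_region(ROWS, [attr])),
              dependency(ROWS, [attr]),
              sorted(boundary(ROWS, [attr])))
    for pair, attrs in discernibility_matrix(ROWS, ["a", "b"]).items():
        print(pair, sorted(attrs))
```

On this toy table, attribute `a` has the larger positive region, the higher dependency degree, and the smaller boundary, so all three criteria pick it first. Each entry of the discernibility matrix is the set of attributes that distinguishes one pair of objects with different decisions. These sets are the "discernibility elements" whose size the proposed criterion is based on.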