Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (25): 146-148. DOI: 10.3778/j.issn.1002-8331.2008.25.044

• Database, Signal and Information Processing •

An attribute selection method for decision trees based on fuzzy gain ratio

YAN Zhi-jia1,2, JIN Lian-fu1

  1. Computer Institute of Zhejiang University, Hangzhou 310027, China
  2. Zhejiang Yuying College, Hangzhou 310018, China
  • Received: 2007-11-01 Revised: 2008-01-23 Online: 2008-09-01 Published: 2008-09-01
  • Contact: YAN Zhi-jia

Abstract: Selecting the attribute for each node is a key step in building a decision tree. In classical decision tree algorithms, represented by ID3 and C4.5, the node attribute is chosen by computing the information gain or gain ratio from the number of samples in each subset. For continuous attributes, however, discretization leaves the membership of elements near subset boundaries fuzzy, so a computation based on sample counts is not entirely reasonable. To address this problem, this paper adopts fuzzy set theory and uses degrees of fuzziness in place of sample counts in the gain ratio computation, giving a feasible way to obtain a measure of uncertainty in decision tree classification.

Key words: decision tree, fuzzy set, fuzzy gain ratio, clustering
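
The abstract does not give the formula, so the following is only a minimal sketch of the general idea it describes: the C4.5 gain ratio is computed from sums of membership degrees rather than from crisp sample counts. The names `fuzzy_gain_ratio`, `crisp_membership`, `ramp_membership`, and the `width` parameter are hypothetical, and the linear ramp around the split threshold is an assumed membership function. The paper may define memberships differently (the keyword "clustering" suggests they may come from a clustering step).

```python
import math
from collections import defaultdict


def entropy(weights):
    """Shannon entropy (base 2) of a list of non-negative weights."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)


def fuzzy_gain_ratio(memberships, labels):
    """Gain ratio where membership sums stand in for sample counts.

    memberships[i][j] is the degree to which sample i belongs to subset j.
    Each row is assumed to sum to 1 (an assumption of this sketch).
    With 0/1 memberships this reduces to the classical C4.5 gain ratio.
    """
    n_subsets = len(memberships[0])
    class_total = defaultdict(float)
    subset_class = [defaultdict(float) for _ in range(n_subsets)]
    for row, label in zip(memberships, labels):
        for j, mu in enumerate(row):
            subset_class[j][label] += mu
            class_total[label] += mu

    total = sum(class_total.values())
    subset_total = [sum(d.values()) for d in subset_class]

    h_before = entropy(list(class_total.values()))
    h_after = sum(
        (s / total) * entropy(list(d.values()))
        for s, d in zip(subset_total, subset_class)
        if s > 0
    )
    split_info = entropy(subset_total)
    return (h_before - h_after) / split_info if split_info > 0 else 0.0


def crisp_membership(x, threshold):
    """Classical binary split: each sample belongs to exactly one subset."""
    return [1.0, 0.0] if x <= threshold else [0.0, 1.0]


def ramp_membership(x, threshold, width):
    """Assumed fuzzy split: membership changes linearly within
    [threshold - width, threshold + width] and is crisp outside it."""
    if width <= 0:
        return crisp_membership(x, threshold)
    upper = min(1.0, max(0.0, (x - (threshold - width)) / (2 * width)))
    return [1.0 - upper, upper]


# Toy data (made up for illustration): one continuous attribute, two classes.
values = [1.0, 2.0, 3.0, 4.0]
labels = ["a", "a", "b", "b"]

crisp = fuzzy_gain_ratio([crisp_membership(v, 2.5) for v in values], labels)
fuzzy = fuzzy_gain_ratio([ramp_membership(v, 2.5, 1.0) for v in values], labels)
print(crisp, fuzzy)
```

In this toy case the samples at 2.0 and 3.0 lie inside the ramp, so each contributes partly to both subsets and the fuzzy score comes out lower than the crisp one, reflecting the uncertainty at the boundary.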