Computer Engineering and Applications, 2022, Vol. 58, Issue (10): 139-144. DOI: 10.3778/j.issn.1002-8331.2011-0040

• Pattern Recognition and Artificial Intelligence •

Decision Tree Algorithm Fusing Information Gain and Gini Index
(融合信息增益与基尼指数的决策树算法)

XIE Xin (谢鑫), ZHANG Xianyong (张贤勇), YANG Jilin (杨霁琳)

  1. School of Mathematical Sciences, Sichuan Normal University, Chengdu 610066, China
  2. Institute of Intelligent Information and Quantum Information, Sichuan Normal University, Chengdu 610066, China
  3. College of Computer Science, Sichuan Normal University, Chengdu 610066, China
  • Online: 2022-05-15 Published: 2022-05-15

Abstract: Decision tree algorithms play an important role in data classification in machine learning, but the classification performance of both the ID3 algorithm, which is based on information gain, and the CART algorithm, which is based on the Gini index, leaves room for improvement. An adaptive integrated measure of information gain and the Gini index is constructed in order to design an effective decision tree algorithm that improves on these two basic algorithms. The heterogeneity and independence of information gain (an information-theoretic representation) and the Gini index (an algebraic representation) are analyzed, a fused measure of the two is established through a knowledge-based weighted linear combination, and a heuristic decision tree induction algorithm, IGGI, is developed. IGGI effectively improves on the ID3 and CART algorithms, and data experiments show that it generally achieves better classification accuracy.

Key words: decision tree, information gain, Gini index, uncertainty measurement, adaptive linear fusion, machine learning
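
To make the idea of a weighted linear fusion concrete, the following is a minimal Python sketch, not the authors' IGGI implementation. It computes the standard information gain and a Gini impurity decrease for a categorical attribute and combines them linearly. Several things here are assumptions for illustration only: the function names, the use of Gini decrease (rather than the raw Gini index) so that both terms are maximized, the multiway split on attribute values, and above all the fixed `weight` parameter, which merely stands in for the knowledge-based adaptive weight that the abstract mentions but does not specify.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a sequence of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def partition(values, labels):
    """Group the labels by the attribute value of each sample."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return list(groups.values())


def information_gain(values, labels):
    """Entropy reduction from splitting on the attribute (ID3 criterion)."""
    n = len(labels)
    cond = sum(len(g) / n * entropy(g) for g in partition(values, labels))
    return entropy(labels) - cond


def gini_gain(values, labels):
    """Gini impurity reduction from splitting on the attribute."""
    n = len(labels)
    cond = sum(len(g) / n * gini(g) for g in partition(values, labels))
    return gini(labels) - cond


def fused_score(values, labels, weight=0.5):
    """Weighted linear combination of the two measures.

    ASSUMPTION: `weight` is a hypothetical fixed placeholder; the paper
    derives its weight adaptively, and that rule is not reproduced here.
    """
    return (weight * information_gain(values, labels)
            + (1.0 - weight) * gini_gain(values, labels))


def best_attribute(columns, labels, weight=0.5):
    """Pick the attribute name with the highest fused score."""
    return max(columns, key=lambda name: fused_score(columns[name], labels, weight))


if __name__ == "__main__":
    labels = ["yes", "yes", "no", "no"]
    columns = {"a": ["x", "x", "y", "y"], "b": ["p", "q", "p", "q"]}
    print(best_attribute(columns, labels))
```

In this toy data set, attribute `a` separates the classes perfectly while `b` carries no information, so the fused score selects `a` for any weight strictly between 0 and 1. A full tree builder would apply `best_attribute` recursively to each resulting subset.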