Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (21): 99-101. DOI: 10.3778/j.issn.1002-8331.2008.21.027

• Machine Learning •

Decision forest learning algorithm based on relational data analysis

WANG Li-min, LI Xiong-fei

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  • Received: 2008-04-30 Revised: 2008-06-13 Online: 2008-07-21 Published: 2008-07-21
  • Contact: WANG Li-min

基于关系数据分析的决策森林学习方法

王利民,李雄飞   

  1. 吉林大学 计算机科学与技术学院,长春 130012
  • 通讯作者: 王利民

Abstract: Multiple classifier integration in pattern recognition has received increasing attention and has become an active research topic. This paper proposes a multiple submodel integration algorithm based on decision forest construction. By assigning a distinct classification rule to each sample, a decision forest rather than a single decision tree is constructed to automatically determine relatively independent attribute subsets, and the submodels are then integrated under the conditional independence assumption. The whole learning procedure requires no human intervention. The independent structure of each subtree and the number of decision trees are determined automatically, which lets different classifiers exploit their advantages on different samples and domains. Theoretical analysis and experiments on UCI data sets demonstrate the feasibility and effectiveness of the approach.

Key words: pattern recognition, multiple submodel integration, decision forest, conditional independence assumption

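Abstract (translated from the Chinese): Multiple classifier integration in pattern recognition has attracted growing attention from researchers and has become an active research topic. This paper proposes a multiple submodel integration method based on decision forest construction. By assigning a decision rule to each sample, a decision forest rather than a single decision tree is constructed to automatically determine relatively independent sample subsets, and the models are then integrated under the conditional independence assumption. The whole learning process requires no human intervention and adaptively determines the number of decision trees and the structure of each subtree, so that each classifier can exploit its classification advantage on different samples and different regions. Experimental results and case analysis on UCI machine learning data sets verify the effectiveness of the method.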

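Key words (translated from the Chinese): pattern recognition, multiple submodel integration, decision forest, conditional independence assumption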
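
The abstract describes integrating submodels under the conditional independence assumption but does not give the formula. As a minimal sketch, assume K submodels, each trained on its own attribute subset S_k, and assume the subsets are conditionally independent given the class c. Bayes' rule then gives P(c | x) ∝ P(c)^(1-K) · ∏_k P(c | x_{S_k}). The function name `combine_submodels`, the dictionary layout, and the numbers below are hypothetical illustrations, not the authors' implementation.

```python
import math


def combine_submodels(prior, sub_posteriors):
    """Combine per-subset class posteriors under conditional independence.

    prior: dict mapping class -> P(c)
    sub_posteriors: list of dicts mapping class -> P(c | x_Sk),
        one dict per submodel (hypothetical layout for this sketch)
    Returns a dict mapping class -> normalized combined posterior.
    """
    k = len(sub_posteriors)
    log_scores = {}
    for c, p_c in prior.items():
        # log of P(c)^(1-K) * prod_k P(c | x_Sk), computed in log space
        score = (1 - k) * math.log(p_c)
        for post in sub_posteriors:
            # floor tiny probabilities to avoid log(0)
            score += math.log(max(post[c], 1e-12))
        log_scores[c] = score

    # subtract the maximum before exponentiating for numerical stability
    m = max(log_scores.values())
    unnorm = {c: math.exp(s - m) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


# Illustrative inputs: two classes, two submodels (made-up numbers).
prior = {"a": 0.5, "b": 0.5}
sub_posteriors = [{"a": 0.8, "b": 0.2}, {"a": 0.6, "b": 0.4}]
posterior = combine_submodels(prior, sub_posteriors)
predicted = max(posterior, key=posterior.get)
```

With a single submodel the function returns that submodel's posterior unchanged, which is a quick sanity check on the formula. How the paper selects the attribute subsets and builds each tree is not specified in the abstract, so that part is left out of the sketch.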