Computer Engineering and Applications ›› 2015, Vol. 51 ›› Issue (5): 126-131.

Number of trees in random forest

LIU Min, LANG Rongling, CAO Yongbin   

  1. College of Electrical and Information Engineering, Beihang University, Beijing 100191, China
  • Online:2015-03-01 Published:2015-04-08


Abstract: Random forest (RF) is an ensemble classifier. This paper analyzes the parameters that influence RF performance and finds that the number of trees in the forest has a critical effect. It reviews and summarizes methods for determining the number of trees and for evaluating RF performance. Using classification accuracy as the evaluation metric, it experimentally analyzes, on UCI data sets, the relationship between the number of decision trees in the forest and the data set. The results show that for the majority of data sets, 100 trees are enough for the classification accuracy to meet the requirement. RF is also compared, in terms of accuracy, with the support vector machine (SVM), a classifier known for strong classification performance, and the results show that the classification performance of RF is comparable to that of SVM.

Key words: random forest, support vector machine, classification accuracy
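
One way to build intuition for why accuracy can level off as trees are added is an idealized voting model. The sketch below is not the paper's experiment: it assumes a binary problem in which every tree votes independently and is correct with the same probability `p_correct`, whereas real random forest trees are correlated and the paper uses UCI data sets. The function name `majority_vote_accuracy` and the value 0.6 are hypothetical choices for illustration only.

```python
from math import comb


def majority_vote_accuracy(n_trees: int, p_correct: float) -> float:
    """Probability that a strict majority of independent voters is correct.

    Idealized assumption (not from the paper): each of the n_trees voters
    is correct independently with probability p_correct on a binary task.
    """
    if n_trees < 1 or n_trees % 2 == 0:
        raise ValueError("n_trees must be a positive odd integer (avoids ties)")
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must lie in [0, 1]")

    smallest_majority = n_trees // 2 + 1
    # Sum the binomial probabilities of every outcome where the
    # correct voters form a strict majority.
    return sum(
        comb(n_trees, k) * p_correct**k * (1.0 - p_correct) ** (n_trees - k)
        for k in range(smallest_majority, n_trees + 1)
    )


if __name__ == "__main__":
    # Hypothetical per-tree accuracy of 0.6; odd ensemble sizes only.
    for n in (1, 11, 51, 101, 501):
        print(f"{n:>4} trees -> ensemble accuracy {majority_vote_accuracy(n, 0.6):.4f}")
```

Under these assumptions the ensemble accuracy rises quickly for small ensembles and then flattens, so each additional tree contributes less. With correlated trees, as in an actual random forest, the plateau is generally lower, which is why the number of trees is chosen empirically per data set.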
