Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (15): 286-296. DOI: 10.3778/j.issn.1002-8331.2011-0419

Random Forest Model Stock Price Prediction Based on Pearson Feature Selection

YAN Zhengxu, QIN Chao, SONG Gang   

    1.School of Finance, Shandong University of Finance and Economics, Jinan 250014, China
    2.School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan 250014, China
    3.School of Mathematics, Shandong University, Jinan 250100, China
  • Online: 2021-08-01  Published: 2021-07-26

基于Pearson特征选择的随机森林模型股票价格预测

闫政旭,秦超,宋刚   

    1.山东财经大学 金融学院,济南 250014
    2.山东财经大学 计算机科学与技术学院,济南 250014
    3.山东大学 数学学院,济南 250100

Abstract:

To better predict stock price trends and to address the low prediction accuracy that arises with a large number of features and large data volumes, this study proposes a new combined model that builds on random forest and uses the Pearson coefficient for feature selection. First, a Pearson correlation test removes irrelevant features. Next, an improved grid search method tunes the decision tree parameters. Finally, a random forest performs regression prediction on the remaining features, and a final conclusion is drawn. The experimental results show that the improved random forest substantially reduces both the mean absolute error (MAE) and the mean squared error (MSE) of the predictions. For the Jinshiyuan (今世缘) stock, the MSE and MAE of the improved random forest are 56% and 37.3% lower, respectively, than those of the traditional random forest, and the prediction performance on the other two stocks also improves. The new combined model can perform short-term regression forecasting of stock prices and reduce the influence of noise on the forecasts. This study provides evidence for better stock price forecasting and gives investors a basis for choosing the factors that influence a stock.
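
As a rough illustration of the feature selection step and the two error measures named in the abstract, the sketch below filters features by the absolute value of their Pearson coefficient with the target and computes MAE and MSE. It is not the authors' implementation: the function names, the 0.5 threshold, and the toy data are assumptions made for this example, and the grid search and random forest steps are not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no linear relation to the target
    return cov / math.sqrt(var_x * var_y)


def select_features(features, target, threshold=0.5):
    """Return names of features whose |r| with the target meets the threshold.

    The threshold is a hypothetical choice, not a value taken from the paper.
    """
    return [name for name, column in features.items()
            if abs(pearson(column, target)) >= threshold]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data invented for illustration only (not from the paper).
close = [10.0, 11.0, 12.0, 13.0, 14.0]
features = {
    "open": [9.8, 10.9, 12.1, 12.9, 14.2],   # tracks the closing price
    "noise": [3.0, 1.0, 2.0, 1.0, 3.0],      # uncorrelated with it
}
selected = select_features(features, close)
```

In this toy setup only the `open` column survives the filter. The surviving columns would then be passed to a regressor whose predictions are scored with `mae` and `mse`.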

Key words: Pearson coefficient, random forest, stock, prediction

摘要:

为了能够更好地预测股票的走向趋势,解决在大量特征和大数据下预测精度低的问题,在随机森林的基础上提出了一种基于Pearson系数的随机森林新的组合模型方法。利用Pearson系数进行相关性检验删除无关特征;使用改进的网格搜索法对决策树参数调优;利用随机森林将剩余特征进行建模回归预测,并得出最终结论。实验结果表明:改进后的随机森林在预测值的平均绝对误差(MAE)、均方误差(MSE)都得到了较大的提高。其中今世缘改进后的随机森林比传统随机森林的MSE值降低了56%,MAE值降低了37.3%,其他两只股票预测效果也均得到提高。新的组合模型,可以实现对股票价格的短期预测回归,并且能够降低噪声对股票价格预测的影响。该研究为更好地预测股票价格提供了有效证据并为投资者提供了对股票影响因素的选择。

关键词: Pearson系数, 随机森林, 股票, 预测