Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (18): 293-300. DOI: 10.3778/j.issn.1002-8331.2206-0031

• Engineering and Applications •

Application of Improved NSGA-III-XGBoost Algorithm in Stock Forecasting

HE Yong, LI Huan   

  1. College of Computer Science and Technology, Dongguan University of Technology, Dongguan, Guangdong 523000, China
  • Online: 2023-09-15 Published: 2023-09-15

Abstract: To improve the accuracy of stock forecasting and reduce running time, a stock forecasting model is proposed that combines an improved non-dominated sorting genetic algorithm with the extreme gradient boosting tree model (INSGA-III-XGBoost). The model's feature engineering comprises wavelet decomposition, feature extension, data cleaning, and normalization. To improve the NSGA-III algorithm, the model initializes the population with the combined information from two filter-based feature selection methods. It takes maximizing accuracy and minimizing solution size as its optimization objectives, and uses multi-chromosome hybrid encoding to perform feature selection and model parameter optimization simultaneously. The selected feature subsets and parameters are input into XGBoost for training and forecasting, and are iteratively optimized according to the evaluation metrics. The experimental results show that, compared with the unimproved multi-objective feature selection algorithm and single-objective feature selection algorithms, the INSGA-III-XGBoost algorithm achieves the highest average accuracy, the smallest solution size, and the shortest running time. Compared with deep learning models, it achieves higher accuracy with a greatly reduced running time, and the model is also interpretable.

Key words: multi-objective optimization, feature engineering, feature selection, stock forecasting
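
The hybrid encoding and the two objectives described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the hyperparameter names and ranges in `PARAM_BOUNDS`, the score-averaging rule in `combine_scores`, and the `accuracy_fn` callback are all assumptions made for illustration, and the NSGA-III selection step itself (reference points and niching) is omitted.

```python
import random
from dataclasses import dataclass

# Hypothetical search ranges; the abstract does not list the tuned parameters.
PARAM_BOUNDS = {"max_depth": (2, 10), "learning_rate": (0.01, 0.3)}


@dataclass
class Individual:
    mask: list    # binary chromosome: 1 means the feature is selected
    params: list  # real-valued chromosome: one gene in [0, 1] per hyperparameter


def combine_scores(scores_a, scores_b):
    """Average two filter-method score lists after scaling each by its maximum.

    Assumes non-negative scores with a positive maximum in each list.
    """
    max_a, max_b = max(scores_a), max(scores_b)
    return [(a / max_a + b / max_b) / 2 for a, b in zip(scores_a, scores_b)]


def init_individual(filter_scores, rng):
    """Bias the initial mask toward features with high filter scores."""
    top = max(filter_scores)
    # The top-scored feature is always kept because rng.random() < 1.0.
    mask = [1 if rng.random() < s / top else 0 for s in filter_scores]
    params = [rng.random() for _ in PARAM_BOUNDS]
    return Individual(mask, params)


def decode(ind):
    """Map both chromosomes to a feature index list and a parameter dict."""
    selected = [i for i, bit in enumerate(ind.mask) if bit]
    params = {}
    for gene, (name, (lo, hi)) in zip(ind.params, PARAM_BOUNDS.items()):
        value = lo + gene * (hi - lo)
        # Integer-bounded parameters are rounded to the nearest integer.
        params[name] = round(value) if isinstance(lo, int) else value
    return selected, params


def objectives(ind, accuracy_fn):
    """Return (error, subset size); both objectives are minimized."""
    selected, params = decode(ind)
    return (1.0 - accuracy_fn(selected, params), len(selected))


def dominates(a, b):
    """Pareto dominance for minimization: a is no worse everywhere, better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
```

In a fuller version, `accuracy_fn` would train XGBoost on the selected feature columns with the decoded parameters and return validation accuracy, and the population would be evolved by an NSGA-III implementation from an evolutionary computation library.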