Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (16): 305-315. DOI: 10.3778/j.issn.1002-8331.2205-0415

• Engineering and Applications •

Study of Multi-Factor Quantitative Stock Selection Based on SCDF Algorithm with Feature Permutation

WANG Wenxuan, LI Lu   

  1. School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
  • Online: 2023-08-15 Published: 2023-08-15

基于特征换序的SCDF多因子量化选股研究

王文轩,李路   

  1. 上海工程技术大学 数理与统计学院,上海 201620

Abstract: The sub-layer connection deep forests (SCDF) classification algorithm is built by reconnecting the sub-layers within the cascade layers of a deep forest. Misclassification information is passed between the cascaded sub-layers, so that each subsequent sub-layer receives the corrected features of the preceding sub-layers, which improves both the convergence speed and the classification accuracy of the algorithm. In the multi-granularity scanning stage of the deep forest, data features are ranked by importance using the out-of-bag error, so that factors with higher importance take part in the multi-granularity scan multiple times, compensating for the sampling imbalance of multi-granularity scanning. On this basis, an SCDF multi-factor stock selection model based on feature permutation is constructed. Experiments on CSI 300 stocks from January 2020 to January 2022 show that this model achieves an annualised return of 26.47% and a cumulative return of 120%, exceeding the return of the deep forest.

Key words: deep forest, feature permutation, sub-layer connection, multi-factor stock selection
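
The feature-permutation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the scan is a stride-1 sliding window over the feature vector, the importance scores are already available (here they are made-up placeholder values rather than real out-of-bag estimates), and the reordering rule, which places the most important features at the positions covered by the most windows, is a hypothetical one chosen only to show how reordering changes how often a factor is scanned. All function names are illustrative.

```python
def window_coverage(n_features, window):
    """Count how many stride-1 sliding windows cover each feature position."""
    counts = [0] * n_features
    for start in range(n_features - window + 1):
        for pos in range(start, start + window):
            counts[pos] += 1
    return counts


def reorder_by_importance(importance, window):
    """Return feature indices arranged so that higher-importance features
    sit at positions covered by more windows (hypothetical rule)."""
    n = len(importance)
    coverage = window_coverage(n, window)
    # Positions from most-covered to least-covered (ties: leftmost first).
    positions = sorted(range(n), key=lambda p: (-coverage[p], p))
    # Features from most to least important (ties: lowest index first).
    features = sorted(range(n), key=lambda f: (-importance[f], f))
    order = [None] * n
    for pos, feat in zip(positions, features):
        order[pos] = feat
    return order


def scan(row, order, window):
    """Apply the feature order to one sample, then slide the window over it."""
    reordered = [row[f] for f in order]
    return [reordered[s:s + window] for s in range(len(row) - window + 1)]


# Placeholder importance scores for five factors (assumed, not from the paper).
importance = [0.05, 0.30, 0.10, 0.40, 0.15]
order = reorder_by_importance(importance, window=3)
windows = scan([10, 11, 12, 13, 14], order, window=3)
print(window_coverage(5, 3))  # edge positions are covered by fewer windows
print(order)
print(windows)
```

With a window of size 3 over five features, the edge positions fall inside a single window while the centre position falls inside three, which is the kind of sampling imbalance the abstract refers to. In this sketch the factor with the largest placeholder score ends up in every window.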
