Computer Engineering and Applications, 2021, Vol. 57, Issue (1): 110-117. DOI: 10.3778/j.issn.1002-8331.2002-0101
SHU Shike, LI Lu
舒时克,李路
Abstract:
To address the complex relationships among features in high-dimensional datasets, this paper proposes replacing the L1 penalty in LR-Elastic Net with the SCAD (Smoothly Clipped Absolute Deviation) penalty and the MCP (Minimax Concave Penalty), constructing the LR-SCAD and LR-MCP models respectively, and solves them with the ADMM (Alternating Direction Method of Multipliers) algorithm. Simulation experiments show that the LR-Elastic Net model handles small-sample data with correlated features well, while the LR-SCAD and LR-MCP models perform well on large-sample data with correlated features. The paper also builds LR-Elastic Net, LR-SCAD and LR-MCP strategies and applies them to data on the CSI 300 Index. Back-test results show that the LR-SCAD and LR-MCP strategies outperform the LR-Elastic Net strategy on highly correlated data.
Key words: Elastic Net, Smoothly Clipped Absolute Deviation(SCAD), Minimax Concave Penalty(MCP), Alternating Direction Method of Multipliers(ADMM) algorithm, logistic regression, multi-factor stock selection
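The abstract names the three penalties but does not write them out. As a rough illustration, the sketch below evaluates the L1, SCAD and MCP penalties on a single coefficient, using the standard textbook definitions rather than anything taken from the paper itself. The function names and the default shape parameters (a = 3.7 for SCAD, gamma = 3.0 for MCP) are assumptions chosen for the example. It shows why the two non-convex penalties are considered less biased: they level off for large coefficients, whereas the L1 penalty keeps growing.

```python
import numpy as np


def l1_penalty(beta, lam):
    """L1 (lasso) penalty: grows linearly without bound."""
    return lam * np.abs(beta)


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty (standard definition, requires a > 2).

    Linear near zero, quadratic in the middle, constant beyond a * lam.
    The default a = 3.7 is an assumption for this sketch.
    """
    b = np.abs(beta)
    linear = lam * b
    quadratic = (2 * a * lam * b - b**2 - lam**2) / (2 * (a - 1))
    constant = lam**2 * (a + 1) / 2
    return np.where(b <= lam, linear, np.where(b <= a * lam, quadratic, constant))


def mcp_penalty(beta, lam, gamma=3.0):
    """MCP penalty (standard definition, requires gamma > 1).

    Concave up to gamma * lam, constant afterwards.
    The default gamma = 3.0 is an assumption for this sketch.
    """
    b = np.abs(beta)
    concave = lam * b - b**2 / (2 * gamma)
    constant = gamma * lam**2 / 2
    return np.where(b <= gamma * lam, concave, constant)


if __name__ == "__main__":
    lam = 1.0
    for beta in (0.5, 2.0, 10.0):
        print(
            f"beta={beta}: "
            f"L1={float(l1_penalty(beta, lam)):.3f}, "
            f"SCAD={float(scad_penalty(beta, lam)):.3f}, "
            f"MCP={float(mcp_penalty(beta, lam)):.3f}"
        )
```

For small coefficients all three penalties behave much like L1, which is what preserves sparsity; for large coefficients SCAD and MCP stop increasing, so large effects are not shrunk further.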
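Abstract (translated from the Chinese):
To address the complex relationships among features in high-dimensional datasets, and because the traditional L1 penalty does not satisfy the unbiasedness requirement of the Oracle property, the L1 penalty in the logistic regression elastic net (LR-Elastic Net) is replaced with the SCAD (Smoothly Clipped Absolute Deviation) and MCP (Minimax Concave Penalty) penalties. The resulting LR-SCAD and LR-MCP models retain sparsity while also satisfying unbiasedness, and are solved with the ADMM (Alternating Direction Method of Multipliers) algorithm. Simulation experiments show that the LR-Elastic Net model handles small-sample data with correlated features well, while the LR-SCAD and LR-MCP models perform better on large-sample data with correlated features. LR-Elastic Net, LR-SCAD and LR-MCP strategies are then built and applied to data on the constituent stocks of the CSI 300 Index. Back-test results show that the LR-SCAD and LR-MCP strategies perform better than the LR-Elastic Net strategy on data in which the stocks are strongly correlated.
Key words (translated from the Chinese): Elastic Net, SCAD, MCP, ADMM algorithm, logistic regression, multi-factor stock selection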
SHU Shike, LI Lu. Multi-factor Quantitative Stock Selection Strategy Based on Sparsity Penalty[J]. Computer Engineering and Applications, 2021, 57(1): 110-117.
舒时克,李路. 正则稀疏化的多因子量化选股策略[J]. 计算机工程与应用, 2021, 57(1): 110-117.
URL: http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.2002-0101
http://cea.ceaj.org/EN/Y2021/V57/I1/110