Research on Financial Trading Algorithm Based on Deep Reinforcement Learning

doi:10.3778/j.issn.1002-8331.2109-0507

Abstract

Trading strategies play a central role in automated stock trading, and selecting a strategy in a complex, dynamic financial market is an important research direction in modern programmatic trading. Reinforcement learning seeks the optimal dynamic trading strategy by interacting with the actual environment so as to maximize returns. This paper proposes an end-to-end deep reinforcement learning algorithm for automated trading that combines a CNN with an LSTM: the CNN module perceives dynamic stock market conditions and extracts dynamic features, the LSTM module learns the temporal patterns of the time series, and the deep reinforcement learning component, interacting with the unknown environment, accumulates the final profit and produces the trading strategy. Experiments on real stock data show that the method significantly outperforms the benchmark method and offers better scalability and robustness.

Keywords: trading strategy, reinforcement learning, deep learning, quantitative finance
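
The pipeline summarized in the abstract (convolutional feature extraction, recurrent modelling of the time series, then a reinforcement learning decision) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the abstract does not specify the network sizes, the action set, or the reinforcement learning update rule, so the three-action set `ACTIONS`, the helpers `init_params`, `q_values` and `td_target`, the random untrained weights, the made-up return window, and the Q-learning style target are all assumptions chosen for illustration. Biases and training are omitted to keep the sketch short.

```python
import math
import random

ACTIONS = ("sell", "hold", "buy")  # hypothetical discrete action set


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng):
    # Small random weights; a real model would learn these.
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def conv1d_relu(series, kernels):
    """Slide each kernel over the series; one feature vector per time step."""
    k = len(kernels[0])
    return [
        [max(0.0, sum(w * x for w, x in zip(kernel, series[t:t + k])))
         for kernel in kernels]
        for t in range(len(series) - k + 1)
    ]


def lstm_step(x, h, c, w):
    """One LSTM cell step; w maps a gate name to a weight matrix over [x; h]."""
    z = x + h  # concatenate input features and previous hidden state
    i = [sigmoid(a) for a in matvec(w["i"], z)]    # input gate
    f = [sigmoid(a) for a in matvec(w["f"], z)]    # forget gate
    o = [sigmoid(a) for a in matvec(w["o"], z)]    # output gate
    g = [math.tanh(a) for a in matvec(w["g"], z)]  # candidate cell state
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c


def init_params(n_kernels=4, kernel_size=3, hidden=5, seed=0):
    rng = random.Random(seed)
    return {
        "kernels": rand_matrix(n_kernels, kernel_size, rng),
        "lstm": {gate: rand_matrix(hidden, n_kernels + hidden, rng)
                 for gate in "ifog"},
        "head": rand_matrix(len(ACTIONS), hidden, rng),
    }


def q_values(returns, params):
    """CNN features -> LSTM over time -> one value estimate per action."""
    feats = conv1d_relu(returns, params["kernels"])
    hidden = len(params["head"][0])
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in feats:
        h, c = lstm_step(x, h, c, params["lstm"])
    return matvec(params["head"], h)


def td_target(reward, next_q, gamma=0.99):
    """Q-learning style target (an assumed update rule, not from the paper)."""
    return reward + gamma * max(next_q)


# Usage with made-up daily returns and untrained weights.
params = init_params()
window = [0.01, -0.02, 0.005, 0.012, -0.007, 0.003, 0.009, -0.001]
q = q_values(window, params)
action = ACTIONS[max(range(len(q)), key=q.__getitem__)]
```

With untrained weights the selected `action` carries no meaning. The sketch only shows how a window of market data could flow through the convolutional, recurrent, and decision stages to yield one value estimate per action.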

XU Jie, ZHU Yukun, XING Chunxiao. Research on Financial Trading Algorithm Based on Deep Reinforcement Learning[J]. Computer Engineering and Applications, 2022, 58(7): 276-285.

许杰, 祝玉坤, 邢春晓. 基于深度强化学习的金融交易算法研究[J]. 计算机工程与应用, 2022, 58(7): 276-285.