Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (8): 297-305. DOI: 10.3778/j.issn.1002-8331.2112-0142

• Engineering and Applications •

Feature Mining for Financial Texts and Research on Dynamic Factor Fusion Strategy

ZHANG Wei, ZHU Hanqing, GAO Zhigang   

  1. College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
  2. Key Laboratory of Brain Machine Collaborative Intelligence of Zhejiang Province, Hangzhou 310018, China
  • Online: 2023-04-15 Published: 2023-04-15



Abstract: Existing financial text analysis relies largely on non-normative financial texts, so the financial features it extracts are of limited effectiveness. To address this problem, a normative financial text feature mining (NFTFM) model is proposed that takes brokerage research reports as its object of study and extracts effective financial feature factors from normative financial texts. First, a normative financial text sentiment dictionary (NFTSD) is constructed to fully mine the semantics of brokerage reports. Then, the K-nearest neighbor (KNN) algorithm is used to classify the evaluative attitude of the report authors. Finally, the attitude classification results are aggregated along the time dimension into two financial feature factors: the rate volatility (RV) factor and the rate consistency (RC) factor. To address the problem that factor weights in traditional quantitative multi-factor models cannot adapt to market changes, a dynamically optimized fusion factor strategy is proposed, in which the factor weights are dynamically optimized by a genetic algorithm. To verify the effectiveness of the normative financial feature factors and of the dynamically optimized fusion factor strategy, the RC and RV factors are taken as the basic factor set to construct a multi-factor strategy instance for CSI 500 (China Securities 500) stocks, and a historical backtest is carried out. The results show that the strategy returns improve markedly over the benchmark returns and adapt well to different market environments, indicating that the NFTFM model effectively extracts normative financial feature factors and that the factors under the dynamically optimized fusion factor strategy can adapt to market changes.

Key words: normative financial text, data analysis, K-nearest neighbor (KNN), multi-factor strategy, genetic algorithm

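The abstract describes optimizing factor weights with a genetic algorithm but gives no implementation details. The following is a minimal sketch of that idea, not the authors' method: the fitness function (mean return of the top half of stocks ranked by fused score), the blend crossover, the Gaussian mutation, all parameter values, and the synthetic RC/RV data are assumptions made purely for illustration.

```python
import random


def normalize(weights):
    """Clip weights to be non-negative and rescale them to sum to 1."""
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped)
    if total == 0.0:
        return [1.0 / len(clipped)] * len(clipped)
    return [w / total for w in clipped]


def fitness(weights, factor_rows, returns):
    """Hypothetical objective: mean return of the top half of stocks by fused score."""
    scores = [sum(w * f for w, f in zip(weights, row)) for row in factor_rows]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = order[: max(1, len(order) // 2)]
    return sum(returns[i] for i in top) / len(top)


def optimize_weights(factor_rows, returns, pop_size=20, generations=30,
                     mutation=0.1, seed=0):
    """Search for factor weights with a simple elitist genetic algorithm."""
    rng = random.Random(seed)
    n = len(factor_rows[0])
    # Start from the equal-weight baseline plus random candidates.
    population = [[1.0 / n] * n]
    population += [normalize([rng.random() for _ in range(n)])
                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(population,
                        key=lambda w: fitness(w, factor_rows, returns),
                        reverse=True)
        parents = ranked[: pop_size // 2]
        children = [ranked[0]]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            alpha = rng.random()
            # Blend crossover followed by Gaussian mutation.
            child = [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]
            child = [x + rng.gauss(0.0, mutation) for x in child]
            children.append(normalize(child))
        population = children
    return max(population, key=lambda w: fitness(w, factor_rows, returns))


# Synthetic example: each row holds one stock's (RC, RV) factor values.
data_rng = random.Random(42)
factor_rows = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(50)]
returns = [0.8 * rc + 0.2 * rv + data_rng.gauss(0, 0.1) for rc, rv in factor_rows]

best = optimize_weights(factor_rows, returns)
print("fused weights (RC, RV):", best)
```

Under these assumptions, a "dynamic" strategy would rerun `optimize_weights` on a rolling window of recent data at each rebalancing date, so that the weights follow changing market conditions instead of staying fixed.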