Computer Engineering and Applications, 2020, Vol. 56, Issue (21): 139-144. DOI: 10.3778/j.issn.1002-8331.1908-0242

• Pattern Recognition and Artificial Intelligence •

Application of Mutual Information and Improved PCA Dimensionality Reduction Algorithm in Stock Price Forecasting

XIE Xinrui, LEI Xiuren, ZHAO Yan   

  1. Department of Computational Mathematics, School of Mathematics, South China University of Technology, Guangzhou 510640, China
  2. Department of Probability Theory and Mathematical Statistics, School of Mathematics, South China University of Technology, Guangzhou 510640, China
  • Online: 2020-11-01 Published: 2020-11-03

Abstract:

To account for both the relevance of each individual feature to the label and the information redundancy among features, a double dimensionality reduction method that combines mutual information with an improved PCA is proposed. Mutual information is first used to screen the large feature set, discarding features that contribute little information about the label. Principal component analysis, with the number of principal components determined jointly by the cumulative variance contribution rate and the multiple correlation coefficient, is then applied for a second reduction. This preserves the information capacity of the principal component model while keeping excessive noise out of it, which supports the accuracy of the prediction. Neural network predictions on real data from a single stock demonstrate the effectiveness of the proposed dimensionality reduction algorithm.

Key words: mutual information (MI), improved PCA, double dimensionality reduction, neural network prediction
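
The two screening criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the equal-width histogram estimator of mutual information, the bin count, and the 0.85 threshold are all assumptions chosen for illustration. The eigenvalues are assumed to come from an eigendecomposition of the covariance matrix of the retained features, and the paper's additional multiple correlation coefficient criterion is not reproduced here because the abstract does not specify it.

```python
import math
from collections import Counter


def _bin(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant column: every value lands in bin 0
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Histogram estimate of I(X; Y) in nats (an assumed estimator)."""
    bx, by = _bin(x, bins), _bin(y, bins)
    n = len(x)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    # sum over occupied cells of p(i, j) * log(p(i, j) / (p(i) * p(j)))
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in pxy.items()
    )


def select_by_mi(columns, label, keep, bins=4):
    """Stage 1: indices of the `keep` feature columns with the highest MI with the label."""
    scores = [mutual_information(col, label, bins) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda k: scores[k], reverse=True)
    return sorted(ranked[:keep])


def components_by_cumulative_variance(eigenvalues, threshold=0.85):
    """Stage 2 (partial): smallest k whose cumulative variance contribution
    rate reaches `threshold`. The threshold value is illustrative only."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return k
    return len(ordered)
```

In this sketch, a feature that carries no information about the label scores an estimated mutual information of zero and is dropped in the first stage. The second function then picks the number of principal components from the eigenvalue spectrum alone.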