Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (22): 199-207. DOI: 10.3778/j.issn.1002-8331.2105-0510

• Pattern Recognition and Artificial Intelligence •

High-Dimensional Data Feature Selection Algorithm Based on Multifactor Particle Swarm Optimization

LIN Weixing, WANG Yujia, CHEN Wanfen, LIANG Haina   

  1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  • Online: 2021-11-15 Published: 2021-11-16





Feature selection is an important data preprocessing technique in machine learning and data mining. It aims to maximize classification accuracy while minimizing the number of features in the selected subset. When particle swarm optimization is used to search for the optimal feature subset of a high-dimensional dataset, it tends to fall into local optima and incurs a high computational cost, which reduces classification accuracy. To address these problems, a high-dimensional data feature selection algorithm based on multifactor particle swarm optimization is proposed. First, an evolutionary multitasking framework is introduced and a strategy for generating a two-task model is proposed. Knowledge transfer between the two tasks strengthens communication within the population and increases population diversity, which mitigates the tendency to fall into local optima. Second, an initialization strategy based on sparse representation is designed: the initial solutions are generated in sparse form, which reduces the computational cost as the population converges toward the optimal solution set. Experimental results on 6 public high-dimensional medical datasets show that the proposed algorithm performs the classification task effectively and achieves better classification accuracy.
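The abstract does not give the details of the sparse initialization or of the objective, so the following is only a minimal sketch of how such a scheme might look in a PSO-based feature selector, not the authors' implementation. The names `sparse_init`, `decode`, and `fitness`, the selection threshold of 0.6, the sparsity of 0.05, and the weight `alpha` are all assumptions made for illustration.

```python
import random

def sparse_init(n_particles, n_features, sparsity=0.05, threshold=0.6, seed=None):
    """Build a swarm of positions in [0, 1] where only about `sparsity`
    of the entries start on the 'selected' side of `threshold` (assumed values)."""
    rng = random.Random(seed)
    swarm = []
    for _ in range(n_particles):
        position = []
        for _ in range(n_features):
            if rng.random() < sparsity:
                # Feature starts selected: draw from the upper band.
                position.append(rng.uniform(threshold, 1.0))
            else:
                # Feature starts unselected: draw from the lower band.
                position.append(rng.uniform(0.0, threshold))
        swarm.append(position)
    return swarm

def decode(position, threshold=0.6):
    """Return the indices of the features a particle currently selects."""
    return [i for i, x in enumerate(position) if x > threshold]

def fitness(error_rate, n_selected, n_features, alpha=0.9):
    """Hypothetical weighted objective (lower is better): classification
    error plus the fraction of features kept."""
    return alpha * error_rate + (1 - alpha) * n_selected / n_features

if __name__ == "__main__":
    swarm = sparse_init(n_particles=5, n_features=1000, seed=0)
    for particle in swarm:
        print(len(decode(particle)), "features selected out of 1000")
```

Starting each particle with few selected features means that early fitness evaluations train the classifier on small subsets, which is one plausible reading of how a sparse start reduces computational cost on high-dimensional data.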

Key words: high-dimensional data, feature selection, evolutionary multitasking, particle swarm optimization (PSO)


