Computer Engineering and Applications ›› 2019, Vol. 55 ›› Issue (11): 93-101. DOI: 10.3778/j.issn.1002-8331.1811-0316


Feature Selection of High-Dimensional Data Based on ABC and CRO Algorithm

ZHANG Ge, WANG Jianlin   

  1. School of Computer and Information Engineering, Henan University, Kaifeng, Henan 475004, China
  • Online: 2019-06-01  Published: 2019-05-30



Abstract: High-dimensional data sets contain thousands of features that can be used for data analysis and prediction. However, many of these features are irrelevant or redundant, which degrades the accuracy of analysis and prediction, and existing classification techniques have difficulty identifying the best feature subset. To address this problem, this paper proposes AB-CRO, a wrapper-based feature selection method that combines the advantages of the Artificial Bee Colony (ABC) algorithm and an improved Chemical Reaction Optimization (CRO) algorithm. Because superior individuals may be consumed during the chemical reaction process as the iterations proceed, an elite retention strategy is added to preserve the quality of the population. The proposed method is compared on public data sets with the benchmark algorithms ABC and CRO and with feature selection methods based on GA, PSO, and ISFLA. The experimental results show that AB-CRO improves on these methods in both the identification of the best feature subset and classification accuracy.

Key words: feature selection, biomedical dataset, artificial bee colony algorithm, chemical reaction optimization algorithm, elite retention strategy
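
The wrapper evaluation and elite retention described in the abstract can be illustrated with a minimal sketch. It is not the AB-CRO algorithm: the abstract does not specify the ABC or CRO operators, so a random bit flip stands in for them, and the nearest-centroid classifier, the toy data, and all function names and parameters are assumptions made for illustration only.

```python
import random


def nearest_centroid_accuracy(X, y, mask):
    """Wrapper fitness (assumed): leave-one-out accuracy of a nearest-centroid
    classifier that only sees the features where mask[j] == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        # Build per-class feature sums from every sample except sample i.
        sums, counts = {}, {}
        for k in range(len(X)):
            if k == i:
                continue
            s = sums.setdefault(y[k], [0.0] * len(cols))
            for c, j in enumerate(cols):
                s[c] += X[k][j]
            counts[y[k]] = counts.get(y[k], 0) + 1

        def dist(label):
            # Squared distance from sample i to the centroid of `label`.
            return sum((X[i][j] - sums[label][c] / counts[label]) ** 2
                       for c, j in enumerate(cols))

        correct += min(sums, key=dist) == y[i]
    return correct / len(X)


def flip_random_bit(mask, rng):
    """Placeholder search move; NOT the ABC or CRO operators of the paper."""
    child = list(mask)
    child[rng.randrange(len(child))] ^= 1
    return child


def select_features(X, y, pop_size=10, n_elite=2, iterations=30, seed=0):
    """Evolve binary feature masks, copying the n_elite best masks unchanged
    into every new generation (elite retention)."""
    rng = random.Random(seed)
    n_features = len(X[0])

    def fitness(mask):
        return nearest_centroid_accuracy(X, y, mask)

    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(iterations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        elites = ranked[:n_elite]  # survive untouched
        children = [flip_random_bit(m, rng)
                    for m in ranked[:pop_size - n_elite]]
        population = elites + children
    best = max(population, key=fitness)
    return best, fitness(best), history


def make_toy_data(n_per_class=15, n_noise=6, seed=1):
    """Synthetic data: two informative features followed by pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            informative = [rng.gauss(3.0 * label, 1.0) for _ in range(2)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
            X.append(informative + noise)
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    best_mask, accuracy, history = select_features(X, y)
    print("selected mask:", best_mask)
    print("leave-one-out accuracy:", accuracy)
```

Because the elites are copied unchanged, the best fitness recorded in `history` can never decrease from one generation to the next, which is the property the elite strategy is meant to guarantee when the search operators might otherwise destroy a good individual.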
