Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (7): 81-95. DOI: 10.3778/j.issn.1002-8331.2406-0410

• Hot Topics and Reviews •

Review of Stability Feature Selection

LIU Zixuan, DU Jianqiang, LUO Jigen, HUANG Qiang, HE Jia, LI Yiwen, QIN Ziyu

  1. College of Computer Science, Jiangxi University of Traditional Chinese Medicine, Nanchang 330004, China
  2. Nanchang University, Nanchang 330031, China
  3. Key Laboratory of Artificial Intelligence in Chinese Medicine, Jiangxi University of Chinese Medicine, Nanchang 330004, China
  • Online: 2025-04-01 Published: 2025-04-01

Abstract: Feature selection plays an important role in preprocessing high-dimensional data. By selecting from the original feature set the features that contribute most to model performance, it reduces data dimensionality, improves model accuracy, and lowers the risk of overfitting. Stability is a key research topic in feature selection that should not be overlooked; it refers to the robustness of a feature selection method to small perturbations in the training samples. This paper first analyzes the multiple causes of instability in the feature selection process. It then systematically summarizes and compares methods for improving stability, describing the objectives and evaluation criteria of each class of methods along with their distinctive advantages and potential drawbacks. Next, it describes the properties of the measures used to evaluate feature selection stability, and analyzes and classifies these measures. Finally, it discusses open problems in stable feature selection and the outlook for future work, with the aim of providing a useful reference for subsequent research and practice.

Key words: feature selection, stability measure, ensemble feature selection, instability
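
The abstract defines stability as robustness of the selected features to small perturbations of the training samples. The sketch below shows one common way to quantify this and is not taken from the paper. It reruns a toy selector on bootstrap resamples and averages the pairwise Jaccard similarity of the resulting feature subsets. The function names (`select_top_k`, `bootstrap_stability`), the correlation-based selector, and the choice of Jaccard similarity are all assumptions made for this example.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two feature subsets given as index collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty selections are treated as identical
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(subsets):
    """Average Jaccard similarity over all pairs of selected subsets."""
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two subsets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; returns 0.0 for a constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / (vx * vy) ** 0.5


def select_top_k(X, y, k):
    """Toy filter selector (assumption): keep the k features most correlated with y."""
    n_features = len(X[0])
    scores = [abs_correlation([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def bootstrap_stability(X, y, k, n_rounds=20, seed=0):
    """Perturb the training set by bootstrap resampling and score the
    agreement of the selected subsets (1.0 means identical selections)."""
    rng = random.Random(seed)
    n = len(X)
    subsets = []
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        subsets.append(select_top_k([X[i] for i in idx], [y[i] for i in idx], k))
    return mean_pairwise_jaccard(subsets)
```

A score close to 1 means the selector returns nearly the same features on every resample, while a score close to 0 means the selection changes substantially under small data perturbations. Jaccard similarity is only one of many possible stability measures.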