Computer Engineering and Applications, 2022, Vol. 58, Issue (22): 54-64. DOI: 10.3778/j.issn.1002-8331.2203-0527
胡振威,汪廷华,周慧颖
HU Zhenwei, WANG Tinghua, ZHOU Huiying
Online: 2022-11-15
Published: 2022-11-15
Abstract: The Hilbert-Schmidt independence criterion (HSIC) is a kernel-based measure of statistical independence; it is simple to compute, converges quickly, and has low bias, and is therefore widely used in statistical analysis and machine learning. Feature selection is an effective dimensionality reduction technique that evaluates the importance of features and constructs an optimal feature subspace suited to the learning task. This paper systematically reviews HSIC-based feature selection methods, describes their theoretical foundations, algorithmic models, and solution methods in detail, analyzes the strengths and weaknesses of HSIC-based feature selection, and gives an outlook on future research.
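To make the criterion concrete, the sketch below is a minimal illustration, not code from the surveyed papers: it computes the standard biased empirical HSIC estimator, HSIC ≈ (n-1)^{-2} tr(KHLH), with Gaussian Gram matrices K and L and centering matrix H, and uses it as a simple filter score that ranks each feature by its dependence on the target. Function names such as rank_features_by_hsic, the kernel bandwidth, and the toy data are illustrative assumptions.

```python
import numpy as np

def gaussian_gram(x, sigma=1.0):
    # RBF Gram matrix K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)) for x of shape (n, d)
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def hsic_biased(K, L):
    # Biased empirical HSIC estimator: (n-1)^{-2} * trace(K H L H), H the centering matrix
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2

def rank_features_by_hsic(X, y, sigma=1.0):
    # Filter-style scoring: HSIC between each single feature and the target (higher = more dependent)
    L = gaussian_gram(np.asarray(y, dtype=float).reshape(-1, 1), sigma)
    scores = np.array([hsic_biased(gaussian_gram(X[:, [j]], sigma), L)
                       for j in range(X.shape[1])])
    return np.argsort(-scores), scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)   # only feature 0 is informative
    order, scores = rank_features_by_hsic(X, y)
    print("ranking (best first):", order)
    print("HSIC scores:", np.round(scores, 4))
```

This per-feature ranking is only the simplest filter use of HSIC; the methods surveyed in the paper extend the same dependence-maximization idea to subset search (e.g., backward-elimination schemes such as BAHSIC) and to sparse regression formulations (e.g., HSIC Lasso).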
胡振威, 汪廷华, 周慧颖. 基于核统计独立性准则的特征选择研究综述[J]. 计算机工程与应用, 2022, 58(22): 54-64.
HU Zhenwei, WANG Tinghua, ZHOU Huiying. Review of Feature Selection Methods Based on Kernel Statistical Independence Criteria[J]. Computer Engineering and Applications, 2022, 58(22): 54-64.
[1] 汪玉, 王鑫, 张淑娟, 郑国强, 赵龙, 郑高峰. Research on an efficient knowledge fusion method in heterogeneous big data environments[J]. Computer Engineering and Applications, 2022, 58(6): 142-148.
[2] 卢冰洁, 李炜卓, 那崇宁, 牛作尧, 陈奎. Research progress of machine learning models in vehicle insurance fraud detection[J]. Computer Engineering and Applications, 2022, 58(5): 34-49.
[3] 赵珍珍, 董彦如, 曹慧, 曹斌. Research status of fall detection algorithms for the elderly[J]. Computer Engineering and Applications, 2022, 58(5): 50-65.
[4] 黄彦乾, 迟冬祥, 徐玲玲. Review of embedding learning methods for few-shot learning[J]. Computer Engineering and Applications, 2022, 58(3): 34-49.
[5] 王怡忻, 朱湘茹, 杨利军. EEG depression recognition combining common spatial patterns and brain network features[J]. Computer Engineering and Applications, 2022, 58(22): 150-158.
[6] 卓永泰, 董又铭, 高灿. Three-way feature selection based on neighborhood mutual information[J]. Computer Engineering and Applications, 2022, 58(22): 159-164.
[7] 崔鑫, 徐华, 朱亮. Multi-class ensemble algorithm for imbalanced data[J]. Computer Engineering and Applications, 2022, 58(2): 176-183.
[8] 李云龙, 卿粼波, 韩龙玫, 王昱晨. Survey of visual affordance research[J]. Computer Engineering and Applications, 2022, 58(18): 1-15.
[9] 孙书魁, 范菁, 曲金帅, 路佩东. Survey of generative adversarial networks[J]. Computer Engineering and Applications, 2022, 58(18): 90-103.
[10] 冯钧, 李艳, 杭婷婷. Review of complex question decomposition methods in question answering systems[J]. Computer Engineering and Applications, 2022, 58(17): 23-33.
[11] 李郅琴, 杜建强, 聂斌, 熊旺平, 徐国良, 罗计根, 李冰涛. Feature selection method based on the black widow optimization algorithm[J]. Computer Engineering and Applications, 2022, 58(16): 147-156.
[12] 周慧颖, 汪廷华, 张代俐. Research progress on multi-label feature selection[J]. Computer Engineering and Applications, 2022, 58(15): 52-67.
[13] 孙超, 闻敏, 李鹏祖, 李瑶, Ibegbu Nnamdi JULIAN, 郭浩. Feature extraction and classification of uncertain brain networks based on relative range[J]. Computer Engineering and Applications, 2022, 58(14): 126-133.
[14] 牛红丽, 赵亚枝. Stock price index prediction using the Bagging algorithm and a GRU model[J]. Computer Engineering and Applications, 2022, 58(12): 132-138.
[15] 段刚龙, 王妍, 马鑫, 杨泽阳. Data feature selection method for bank customer classification and an empirical study[J]. Computer Engineering and Applications, 2022, 58(11): 302-312.