Computer Engineering and Applications ›› 2016, Vol. 52 ›› Issue (15): 79-82.


New Laplacian-based feature selection method

QIAN Xiaoliang1,2, ZUO Kaizhong1,2, JIE Biao1,2   

  1. School of Mathematics and Computer Science, Anhui Normal University, Wuhu, Anhui 241003, China
  2. Network and Information Security Engineering Technology Research Center, Anhui Normal University, Wuhu, Anhui 241003, China
  • Online: 2016-08-01 Published: 2016-08-12

新的基于Laplacian的特征选择方法

钱晓亮1,2,左开中1,2,接  标1,2   

  1. 安徽师范大学 数学计算机科学学院,安徽 芜湖 241003
  2. 安徽师范大学 网络与信息安全工程技术研究中心,安徽 芜湖 241003

Abstract: Among feature selection methods, the Lasso has been widely studied and applied. However, a main disadvantage of the Lasso is that it considers only the relationship between samples and their class labels and ignores the distribution information among the samples themselves, which may help to induce more discriminative features. To address this problem, this paper proposes a new Laplacian-based feature selection method, called Lap-Lasso, which simultaneously performs feature selection and preserves the intrinsic relatedness among samples. Specifically, the proposed model includes two regularization terms. The first is a sparsity regularizer, which ensures that only a small number of features are selected. The second is a new Laplacian-based regularization term, which captures the intrinsic relatedness among samples and thereby helps to induce more discriminative features. Experimental results on UCI datasets show that the proposed algorithm achieves better performance than conventional feature selection algorithms.

Key words: feature selection, Laplacian regularization, Lasso, support vector machine, dimensionality reduction
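
The abstract describes the model only in words, so the following is a minimal sketch of what a Lap-Lasso-style objective could look like, not the authors' formulation. It assumes the objective 1/2·||y − Xw||² + β·(Xw)ᵀL(Xw) + λ·||w||₁, where L is a graph Laplacian built from a Gaussian similarity between samples of the same class, and it minimizes that objective with plain proximal gradient descent (ISTA). The function names `class_laplacian` and `lap_lasso`, the parameters `lam`, `beta`, and `sigma`, and the choice of solver are all hypothetical and do not come from the paper.

```python
import math


def class_laplacian(X, y, sigma=1.0):
    """Graph Laplacian L = D - S over samples (assumed construction).

    S[i][j] = exp(-||x_i - x_j||^2 / sigma) if i != j and the labels match,
    otherwise 0. D is the diagonal matrix of row sums of S.
    """
    n = len(X)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and y[i] == y[j]:
                d2 = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
                S[i][j] = math.exp(-d2 / sigma)
    L = [[-S[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        L[i][i] = sum(S[i])
    return L


def soft_threshold(v, t):
    """Proximal operator of t * |v| (shrinks v toward zero by t)."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def lap_lasso(X, y, lam=0.1, beta=0.1, sigma=1.0, iters=500):
    """ISTA on 0.5*||y - Xw||^2 + beta*(Xw)^T L (Xw) + lam*||w||_1 (assumed objective)."""
    n, d = len(X), len(X[0])
    L = class_laplacian(X, y, sigma)
    # The smooth part has Hessian H = X^T (I + 2*beta*L) X and linear term b = X^T y.
    M = [[(1.0 if i == j else 0.0) + 2.0 * beta * L[i][j] for j in range(n)]
         for i in range(n)]
    MX = [[sum(M[i][k] * X[k][j] for k in range(n)) for j in range(d)]
          for i in range(n)]
    H = [[sum(X[k][i] * MX[k][j] for k in range(n)) for j in range(d)]
         for i in range(d)]
    b = [sum(X[k][j] * y[k] for k in range(n)) for j in range(d)]
    # H is positive semidefinite, so its trace bounds its largest eigenvalue
    # and 1/trace is a safe (conservative) step size.
    trace = sum(H[j][j] for j in range(d))
    w = [0.0] * d
    if trace == 0.0:
        return w
    step = 1.0 / trace
    for _ in range(iters):
        grad = [sum(H[j][k] * w[k] for k in range(d)) - b[j] for j in range(d)]
        w = [soft_threshold(w[j] - step * grad[j], step * lam) for j in range(d)]
    return w


# Toy usage: feature 0 carries the class label, feature 1 is unrelated to it.
X_toy = [[1.0, 0.1], [1.0, -0.1], [-1.0, 0.1], [-1.0, -0.1]]
y_toy = [1.0, 1.0, -1.0, -1.0]
w_toy = lap_lasso(X_toy, y_toy, lam=0.1, beta=0.1)
selected = [j for j, wj in enumerate(w_toy) if wj != 0.0]
```

Features whose weight is driven exactly to zero by the soft-thresholding step are the ones discarded. In this sketch the Laplacian term penalizes weight vectors that map same-class samples to different predicted values, which is one plausible reading of "preserving the intrinsic relatedness among samples".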

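Abstract (Chinese version): Among the various feature selection methods, the Lasso has been widely studied and applied. However, a main drawback of using the Lasso for feature selection is that it considers only the correlation between samples and class labels and ignores the intrinsic relatedness among the samples themselves, information that helps to induce more discriminative features. To solve this problem, a new Laplacian-based feature selection method, called Lap-Lasso, is proposed. The proposed Lap-Lasso method first includes a sparsity regularization term, which ensures that only a small number of features are selected. In addition, a new Laplacian-based regularization term is introduced to preserve the geometric distribution information among samples of the same class, which helps to induce more discriminative features. Experimental results on UCI datasets verify the effectiveness of the Lap-Lasso method.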

Key words (Chinese version): feature selection, Laplacian regularization term, Lasso, support vector machine, dimensionality reduction