Computer Engineering and Applications ›› 2018, Vol. 54 ›› Issue (17): 110-115. DOI: 10.3778/j.issn.1002-8331.1710-0279

• Pattern Recognition and Artificial Intelligence •


Attribute selection algorithm based on low rank and graph Laplacian

CAO Zaihui1,2, WU Qingtao1,2, SHI Jinfa2,3   

  1. Collaborative Innovation Center for Aviation Economy Development, Zhengzhou 450015, China
    2. Zhengzhou University of Aeronautics, Zhengzhou 450015, China
    3. North China University of Water Resources and Electric Power, Zhengzhou 450046, China
  • Online: 2018-09-01  Published: 2018-08-30


Abstract: To address the problem that unsupervised attribute selection algorithms rely on a single technique and ignore both the intrinsic correlations among the data and the presence of noise, a low-rank unsupervised attribute selection algorithm based on attribute self-expression is proposed. First, sparse regularization (the ℓ2,1-norm) is introduced into the attribute self-expression loss function to realize unsupervised sparse learning. Second, a low-rank constraint is imposed on the coefficient matrix to reduce the influence of noise and outliers. Then, the low-rank structure and graph Laplacian regularization enable subspace learning to capture both the global and the local structure of the data. Finally, unsupervised learning is achieved through attribute self-expression. Repeated runs on the data sets verify that the algorithm converges quickly and reaches the global optimum. Compared with the four algorithms SOGFS, PCA, LPP, and RSR, its classification accuracy is higher by 16.11%, 14.03%, 9.92%, and 4.2% on average, respectively, and it also attains the highest average mutual information on every data set, indicating that the algorithm is both effective and efficient.
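The abstract names the model's ingredients but gives no formulas or code. As a rough illustration only, the NumPy sketch below assumes an objective of the form min_W ‖X − XW‖2,1 + λ‖W‖2,1 + γ·tr((XW)ᵀL(XW)) subject to rank(W) ≤ r, where L is a graph Laplacian, and solves it by iteratively reweighted least squares with a truncated-SVD projection onto the rank constraint; the objective, the solver, and all names (select_attributes, lam, gamma, rank) are assumptions for illustration, not the authors' exact formulation.

```python
import numpy as np

def l21_row_weights(A, eps=1e-8):
    # Standard IRLS weights for the l2,1-norm: D_ii = 1 / (2 * ||A_i||_2),
    # guarded by eps so all-zero rows do not divide by zero.
    return 1.0 / (2.0 * np.maximum(np.linalg.norm(A, axis=1), eps))

def select_attributes(X, L, lam=1.0, gamma=1.0, rank=10, n_iter=50):
    # Hypothetical sketch (not the paper's exact algorithm): alternate between
    # reweighting the two l2,1 terms and solving the resulting ridge-like
    # linear system, then project W onto the rank-r constraint set.
    # X: (n_samples, n_features); L: (n_samples, n_samples) graph Laplacian.
    n, d = X.shape
    W = np.eye(d)
    for _ in range(n_iter):
        d1 = l21_row_weights(X - X @ W)      # weights for residual rows, shape (n,)
        d2 = l21_row_weights(W)              # weights for rows of W, shape (d,)
        XtD1X = X.T @ (d1[:, None] * X)      # X^T D1 X
        A = XtD1X + lam * np.diag(d2) + gamma * (X.T @ L @ X)
        # Stationarity condition of the weighted objective: A @ W = X^T D1 X
        W = np.linalg.solve(A, XtD1X)
        # Enforce the low-rank constraint with a truncated SVD
        U, s, Vt = np.linalg.svd(W, full_matrices=False)
        W = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    # Attributes are ranked by the l2-norm of the corresponding rows of W.
    return np.argsort(np.linalg.norm(W, axis=1))[::-1]

# Usage sketch: with L = D - S built from a k-NN affinity matrix S,
# top = select_attributes(X, L, lam=0.1, gamma=0.1, rank=20)[:50]
```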

Key words: attribute selection, low-rank constraint, graph Laplacian, subspace learning, sparse regularization