Computer Engineering and Applications ›› 2017, Vol. 53 ›› Issue (5): 147-153. DOI: 10.3778/j.issn.1002-8331.1508-0090

 General framework for constrained dimensionality reduction

YIN Xuesong, JIANG Rongrong, JIANG Lifei, SHI Jianhua   

  1. Department of Computer Science & Technology, Zhejiang Radio & TV University, Hangzhou 310030, China
  • Online: 2017-03-01 Published: 2017-03-03

A nonlinear dimensionality reduction framework based on pairwise constraints

尹学松,蒋融融,江立飞,施建华   

  1. 浙江广播电视大学 计算机系,杭州 310030

Abstract: Semi-supervised dimensionality reduction seeks the optimal low-dimensional structure of high-dimensional data by combining side information with a large number of unlabeled instances. It is regarded as an effective way to understand high-dimensional data such as gene sequences, text, and face images. This paper develops a general framework for semi-supervised dimensionality reduction with pairwise constraints (SSPC). SSPC first learns a discriminant adjacency matrix from the pairwise constraints and the nearest neighbors of the data. It then learns a projection that embeds the data from the original space into a low-dimensional space, such that intra-cluster instances move closer together while inter-cluster instances move as far apart as possible. The proposed method not only finds a linear subspace that is optimal for discrimination but also discovers the nonlinear structure of the manifold. Experimental results on various real data sets show that SSPC outperforms established dimensionality reduction approaches.

Key words: dimensionality reduction, side information, pairwise constraints, prior membership degree, adjacency matrix

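Abstract: Semi-supervised dimensionality reduction uses side information together with a large number of unlabeled samples to find an optimal low-dimensional discriminant space within a high-dimensional data space, which facilitates subsequent classification or clustering. It is regarded as an effective way to understand high-dimensional data such as gene sequences, text, and face images. This paper proposes a general framework for semi-supervised dimensionality reduction based on pairwise constraints (SSPC). The method first learns a discriminant adjacency matrix from the pairwise constraints and the intrinsic geometric structure of the unlabeled samples. It then applies the learned projection to map the data from the original high-dimensional space into a low-dimensional space, so that samples within the same cluster become more compact while samples from different clusters become as far apart as possible. The proposed algorithm not only finds an optimal linear discriminant subspace but also reveals the nonlinear structure of manifold data. Experimental results on several real data sets show that the new method outperforms current mainstream dimensionality reduction algorithms based on pairwise constraints.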

Key words: dimensionality reduction, side information, pairwise constraints, prior membership degree, adjacency matrix
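
The abstract describes two steps (learn a discriminant adjacency matrix from pairwise constraints and nearest neighbors, then learn a projection) but does not give the formulas. The following is therefore only a rough sketch of that general idea for the linear case, not the authors' SSPC algorithm. The function names `discriminant_adjacency` and `fit_projection`, the edge weights (+1 for neighbors and must-link pairs, -1 for cannot-link pairs), and the graph-Laplacian objective are all assumptions made for illustration.

```python
import numpy as np


def discriminant_adjacency(X, must_link, cannot_link, k=2):
    """Build a symmetric signed weight matrix (assumed weighting, not from the paper).

    k-nearest-neighbor pairs and must-link pairs get weight +1,
    cannot-link pairs get weight -1.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances; the diagonal is excluded from the search.
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(dist, np.inf)
    nearest = np.argsort(dist, axis=1)[:, :k]

    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    W = np.maximum(W, W.T)  # symmetrize the neighbor graph
    for i, j in must_link:
        W[i, j] = W[j, i] = 1.0
    for i, j in cannot_link:
        W[i, j] = W[j, i] = -1.0
    return W


def fit_projection(X, W, dim):
    """Return a (d, dim) matrix P with orthonormal columns.

    P minimizes sum_ij W_ij * ||P^T x_i - P^T x_j||^2, which equals
    2 * trace(P^T X^T L X P) for the Laplacian L = D - W. Positive-weight
    pairs are pulled together and negative-weight pairs are pushed apart.
    """
    X = np.asarray(X, dtype=float)
    L = np.diag(W.sum(axis=1)) - W
    M = X.T @ L @ X
    M = (M + M.T) / 2.0  # guard against round-off asymmetry
    _, vecs = np.linalg.eigh(M)  # eigenvalues come back in ascending order
    return vecs[:, :dim]


# Toy data (hypothetical): two groups separated along the first axis.
X = np.array([
    [0.0, 0.0, 0.0], [0.1, 1.0, 0.0], [0.0, 2.0, 0.1],
    [5.0, 0.0, 0.0], [5.1, 1.0, 0.0], [5.0, 2.0, 0.1],
])
must_link = [(0, 2), (3, 5)]
cannot_link = [(0, 3), (2, 5)]

W = discriminant_adjacency(X, must_link, cannot_link, k=2)
P = fit_projection(X, W, dim=1)
Z = X @ P  # one-dimensional embedding
```

In this toy case the cannot-link pairs differ mainly along the first axis, so the learned direction keeps that axis and the two groups remain far apart in `Z`, while must-link pairs end up close together. Capturing nonlinear manifold structure, as the abstract claims for SSPC, would need more than this linear sketch.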