Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (21): 202-204.

• Graphics, Images, and Pattern Recognition •

Semi-supervised boosting based on geodesic distance

LIU Zhiyong¹,², YUAN Yuan³

  1. Industry Center, Shenzhen Polytechnic, Shenzhen, Guangdong 518055, China
  2. School of Mathematics and Computational Science, Sun Yat-Sen University, Guangzhou 510275, China
  3. School of Automotive and Transportation, Shenzhen Polytechnic, Shenzhen, Guangdong 518055, China
  • Online: 2011-07-21 Published: 2011-07-21

Abstract: In many pattern recognition tasks, researchers rely on labeled samples and ignore the information carried by unlabeled samples. In practice, however, labeled samples can be expensive to obtain in labor, materials, and money, whereas unlabeled data is comparatively easy to collect. How to use unlabeled data to improve classifier performance has therefore become an active research topic in pattern recognition in recent years. Existing semi-supervised boosting methods exploit unlabeled samples mainly through their similarity to labeled samples, and this similarity is usually measured with the Euclidean distance. The Euclidean distance reflects only the spatial positions of the samples and ignores the manifold information between them. This paper therefore proposes a semi-supervised boosting algorithm based on the geodesic distance, so that the manifold information of the sample space is reflected. Experimental results on several public data sets demonstrate the effectiveness of the proposed algorithm.

Key words: geodesic distance, semi-supervised learning, manifold, boosting
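
The contrast between Euclidean and geodesic distance described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the geodesic distance is computed, so the sketch assumes the common approximation of shortest paths over a k-nearest-neighbor graph. The Gaussian form of `similarity`, the bandwidth `sigma`, the neighbor count `k`, and the toy semicircle data are likewise assumptions made only for illustration.

```python
import heapq
import math


def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.dist(p, q)


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph as {i: {j: edge length}}."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        # Sort the other points by Euclidean distance and keep the k closest.
        nearest = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: euclidean(points[i], points[j]),
        )[:k]
        for j in nearest:
            d = euclidean(points[i], points[j])
            graph[i][j] = d
            graph[j][i] = d
    return graph


def geodesic_from(graph, source):
    """Dijkstra shortest-path lengths from `source`.

    Path length along the neighbor graph is used here as an
    approximation of geodesic distance (an assumption of this sketch).
    """
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def similarity(distance, sigma=1.0):
    """Gaussian similarity; the form and `sigma` are assumed, not from the paper."""
    return math.exp(-(distance ** 2) / sigma ** 2)


# Toy data: 21 points evenly spaced on a unit semicircle (a 1-D manifold in 2-D).
points = [(math.cos(math.pi * t / 20), math.sin(math.pi * t / 20)) for t in range(21)]
graph = knn_graph(points, k=2)

d_euclid = euclidean(points[0], points[20])  # chord across the semicircle
d_geo = geodesic_from(graph, 0)[20]          # path that follows the arc

print(f"Euclidean: {d_euclid:.3f}, geodesic: {d_geo:.3f}")
print(f"similarity (Euclidean): {similarity(d_euclid):.5f}")
print(f"similarity (geodesic):  {similarity(d_geo):.5f}")
```

On this toy arc the two endpoints are a chord of length 2 apart in Euclidean terms, while the graph path follows the curve and comes out slightly below pi. A similarity built on the geodesic distance therefore treats the endpoints as less alike than one built on the Euclidean distance, which is the kind of manifold information the abstract refers to.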