Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (2): 70-76. DOI: 10.3778/j.issn.1002-8331.1912-0357


Self-Training Algorithm Combining Density Peak and Cut Edge Weight

WEI Danni, YANG Youlong, QIU Haiquan   

  1. School of Mathematics and Statistics, Xidian University, Xi’an 710071, China
  2. College of Information & Network Engineering, Anhui Science and Technology University, Bengbu, Anhui 233030, China
  • Online: 2021-01-15 Published: 2021-01-14





To reduce the influence of mislabeled samples on the performance of self-training during iteration, a self-training algorithm based on density peaks and cut edge weight is proposed. First, a density-based clustering method is used to discover the spatial structure of the data, and representative unlabeled samples are selected for label prediction according to that structure. Second, the cut edge weight is used as the test statistic in a hypothesis test that identifies whether each sample has been labeled correctly. The labeled set is then gradually enlarged until all unlabeled samples have been labeled. The proposed method makes full use of spatial structure information and also addresses the problem that some samples may be assigned the wrong label, which improves the classification accuracy of the algorithm. Extensive experiments on real datasets illustrate the effectiveness of the proposed method.
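The abstract does not give formulas, so the following is only a minimal sketch of the two ingredients it names, not the authors' implementation. It assumes the usual density-peak quantities (a local density `rho` counted within a cutoff `dc`, and a distance `delta` to the nearest denser point) and a normal approximation for the cut edge weight statistic on a k-nearest-neighbor graph. The function names, the weight `1 / (1 + distance)`, and the parameters `dc` and `k` are all hypothetical choices made for illustration.

```python
import numpy as np
from math import erf, sqrt


def density_peaks(X, dc):
    """Return local density rho and distance delta to the nearest denser point.

    Assumption: rho counts neighbors closer than the cutoff dc.
    """
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    rho = (d < dc).sum(axis=1) - 1  # subtract 1 to exclude the point itself
    delta = np.empty(len(X))
    for i in range(len(X)):
        denser = rho > rho[i]
        # Points with no denser point get their largest distance instead.
        delta[i] = d[i, denser].min() if denser.any() else d[i].max()
    return rho, delta


def cut_edge_pvalue(X, y, i, k=3):
    """Left-tail p-value of the cut edge weight of sample i on a k-NN graph.

    Assumption: under H0, each neighbor carries a different label with
    probability equal to the overall share of other-class samples, and the
    statistic is approximately normal.
    """
    d = np.linalg.norm(X - X[i], axis=1)
    order = np.argsort(d)
    nbrs = order[order != i][:k]          # k nearest neighbors, excluding i
    w = 1.0 / (1.0 + d[nbrs])             # hypothetical edge weights
    cut = y[nbrs] != y[i]                 # edges joining different labels
    J = w[cut].sum()                      # observed cut edge weight
    p_diff = 1.0 - np.mean(y == y[i])     # H0 probability that an edge is cut
    mean = p_diff * w.sum()
    var = p_diff * (1.0 - p_diff) * (w ** 2).sum()
    if var == 0:
        return 1.0
    z = (J - mean) / sqrt(var)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))  # standard normal CDF at z
```

Under these assumptions, a small p-value means the sample's cut edge weight is much lower than chance would give, so its label agrees with its neighborhood. A large p-value flags a label that disagrees with most nearby samples and may need to be re-examined before the sample joins the labeled set.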

Key words: self-training, density peak, cut edge weight, hypothesis testing


