Computer Engineering and Applications ›› 2015, Vol. 51 ›› Issue (3): 129-132.

• Database, Data Mining, Machine Learning •

Constructive learning method based on co-training algorithm

LI Ping1, WU Tao2

  1. College of Information Engineering, Fuyang Teacher’s College, Fuyang, Anhui 236041, China
  2. Key Laboratory of Intelligent Computing & Signal Processing, Ministry of Education, Anhui University, Hefei 230039, China
  • Online: 2015-02-01 Published: 2015-01-28

Abstract: Training a classifier with the constructive machine learning (CML) algorithm requires a large number of labeled samples, which are difficult to obtain. To address this, a semi-supervised constructive learning algorithm based on co-training is proposed. The labeled samples are divided equally into three training sets, and the CML algorithm is used to train one classifier on each set. The three classifiers then label the unlabeled samples by joint voting, and the newly labeled samples expand the three training sets in turn until no set can be expanded further. Finally, the three expanded training sets are merged to train the final classifier. Experiments on UCI data sets show that the method classifies more effectively than the CML, Tri-CML, NB, and Tri-NB algorithms.

Key words: semi-supervised learning, constructive machine learning, co-training algorithm, tri-training algorithm, covering algorithm
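
The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions: `CentroidClassifier` is a hypothetical nearest-centroid stand-in for the CML covering learner, which the abstract does not specify; "voting together" is read here as unanimous agreement of the three classifiers; and "in turn" is read as round-robin assignment of newly labeled samples to the three training sets. The names `co_train` and `make_learner` are likewise illustrative.

```python
from collections import defaultdict


class CentroidClassifier:
    """Stand-in base learner (nearest class centroid). NOT the CML covering algorithm."""

    def fit(self, samples):
        # samples: list of (feature_tuple, label) pairs
        sums, counts = {}, defaultdict(int)
        for x, y in samples:
            acc = sums.setdefault(y, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[y] += 1
        self.centroids = {y: [v / counts[y] for v in acc] for y, acc in sums.items()}
        return self

    def predict(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda y: sq_dist(self.centroids[y]))


def co_train(labeled, unlabeled, make_learner=CentroidClassifier):
    """Return (final classifier, merged training set, samples left unlabeled)."""
    # Split the labeled samples into three (near-)equal training sets.
    train_sets = [list(labeled[i::3]) for i in range(3)]
    pool = list(unlabeled)
    turn = 0
    while pool:
        # Retrain one classifier per training set.
        learners = [make_learner().fit(s) for s in train_sets]
        remaining, added = [], 0
        for x in pool:
            votes = {clf.predict(x) for clf in learners}
            if len(votes) == 1:
                # Assumption: accept a label only on a unanimous vote,
                # then assign the sample to the training sets round-robin.
                train_sets[turn % 3].append((x, votes.pop()))
                turn += 1
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:
            break  # no training set could be expanded further
    # Merge the three expanded sets and train the final classifier.
    merged = [s for ts in train_sets for s in ts]
    return make_learner().fit(merged), merged, pool
```

Each round either moves at least one sample out of the unlabeled pool or stops, so the loop terminates. Any learner exposing `fit` and `predict` in this shape could be passed as `make_learner`, including an actual CML implementation.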