Computer Engineering and Applications, 2016, Vol. 52, Issue (20): 20-25.

• Hot Topics and Reviews •

Image classification based on Hough forest and semi-supervised learning

WANG Liguan, FENG Rui   

  1. School of Computer Science and Technology, Fudan University, Shanghai 201203, China
  2. Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai 201203, China
  • Online:2016-10-15 Published:2016-10-14

Abstract: Supervised learning algorithms need labeled samples to train a classification model, and collecting and labeling those samples costs considerable manpower, resources, and time. How to classify images efficiently has therefore long been an active research topic. This paper proposes an image classification algorithm based on Hough forest and semi-supervised learning. It trains the classifier with relatively few samples and keeps acquiring new training samples during classification, while part of the classification results is annotated manually, which effectively improves labeling efficiency. Experiments on the COREL dataset show that the algorithm achieves satisfactory labeling accuracy with a small number of training samples and improves the efficiency of manual annotation.

Key words: supervised learning, Hough forest, semi-supervised learning, Transductive Support Vector Machine (TSVM), image classification
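
The loop outlined in the abstract (train on a few labeled samples, absorb confidently classified samples as new training data, and pass some results to a human annotator) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: a simple nearest-centroid classifier stands in for the Hough forest, and `fit_centroids`, `predict`, `self_train`, the `oracle` callback, and the `threshold` value are hypothetical names and settings chosen for the example.

```python
import math


def fit_centroids(samples):
    """Train the stand-in classifier: one mean feature vector per label."""
    sums, counts = {}, {}
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(model, x):
    """Return (label, confidence) for feature vector x.

    Confidence is d2 / (d1 + d2), where d1 and d2 are the distances to the
    nearest and second-nearest centroids, so it ranges from 0.5 (ambiguous)
    to 1.0 (on a centroid). The model needs at least two classes.
    """
    dists = sorted((math.dist(x, c), y) for y, c in model.items())
    (d1, label), (d2, _) = dists[0], dists[1]
    total = d1 + d2
    return label, (d2 / total if total > 0 else 0.5)


def self_train(labeled, unlabeled, oracle, threshold=0.8, max_rounds=10):
    """Grow the training set from a small labeled seed (assumed scheme).

    Each round retrains on the current labeled set and moves confidently
    classified samples into it with their predicted labels. When no sample
    in the pool is confident, the remainder is labeled by `oracle`, a
    callback standing in for the human annotator.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    manual = 0  # number of samples that needed manual annotation
    for _ in range(max_rounds):
        if not pool:
            break
        model = fit_centroids(labeled)
        confident, uncertain = [], []
        for x in pool:
            label, conf = predict(model, x)
            (confident if conf >= threshold else uncertain).append((x, label))
        if confident:
            labeled.extend(confident)
            pool = [x for x, _ in uncertain]
        else:
            for x, _ in uncertain:
                labeled.append((x, oracle(x)))
                manual += 1
            pool = []
    return fit_centroids(labeled), labeled, manual


# Toy usage: two seed samples, three unlabeled points, one of them ambiguous.
seed = [((0.0, 0.0), "a"), ((10.0, 10.0), "b")]
pool = [(1.0, 1.0), (9.0, 9.0), (5.0, 5.2)]
model, labeled, manual = self_train(seed, pool, oracle=lambda x: "b")
print(manual, len(labeled))
```

In this toy run the two points near the seeds are absorbed automatically, and only the ambiguous middle point is routed to the annotator callback, which is the kind of saving in manual effort the abstract refers to. A real system would replace the centroid model with a trained Hough forest and use image features rather than 2-D points.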