Co-Training Method Combined with Semi-Supervised Clustering and Weighted K-Nearest Neighbor

GONG Yanlu, LV Jia

College of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China

Computer Engineering and Applications, 2019, Vol. 55, Issue (22): 114-118. DOI: 10.3778/j.issn.1002-8331.1807-0159

Online: 2019-11-15 Published: 2019-11-13

Abstract

Co-training has two weaknesses: the unlabeled samples selected for addition in each iteration carry too little useful information, and disagreement among the multiple classifiers causes unlabeled samples to be mislabeled. To address both, this paper proposes a co-training method that combines semi-supervised clustering with the weighted K-nearest neighbor algorithm. In each iteration, the method first performs semi-supervised clustering on the training set and passes the unlabeled samples with high membership degree to the naive Bayes classifiers. It then uses the weighted K-nearest neighbor algorithm to reclassify the unlabeled samples on which the classifiers disagree. Semi-supervised clustering selects samples that better reflect the spatial structure of the data, while relabeling the disputed samples with weighted K-nearest neighbor counters the loss of classification accuracy caused by inconsistent labels. Comparative experiments on UCI datasets verify the effectiveness of the algorithm.

Key words: co-training, semi-supervised clustering, weighted K-nearest neighbor, view
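
The relabeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the weighting scheme, so inverse Euclidean distance weights are assumed here, and the function names `weighted_knn_label` and `resolve_disagreements`, the parameter `k`, and the two prediction arrays (standing in for the outputs of the per-view naive Bayes classifiers) are hypothetical.

```python
import numpy as np


def weighted_knn_label(x, X_labeled, y_labeled, k=3, eps=1e-12):
    """Label x by a distance-weighted vote among its k nearest labeled samples.

    Assumption: each neighbor votes with weight 1 / distance. The paper's
    exact weighting is not given in the abstract.
    """
    dists = np.linalg.norm(X_labeled - x, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = {}
    for i in nearest:
        label = y_labeled[i]
        # eps avoids division by zero when x coincides with a labeled sample
        votes[label] = votes.get(label, 0.0) + 1.0 / (dists[i] + eps)
    return max(votes, key=votes.get)


def resolve_disagreements(X_unlabeled, pred_view1, pred_view2,
                          X_labeled, y_labeled, k=3):
    """Keep labels both classifiers agree on; relabel the rest with weighted KNN."""
    pred_view1 = np.asarray(pred_view1)
    pred_view2 = np.asarray(pred_view2)
    final = pred_view1.copy()
    # Indices where the two (hypothetical) per-view classifiers disagree
    for i in np.flatnonzero(pred_view1 != pred_view2):
        final[i] = weighted_knn_label(X_unlabeled[i], X_labeled, y_labeled, k)
    return final
```

Samples on which both classifiers agree keep their label, and only the disputed ones are passed to the weighted vote, so nearer labeled neighbors carry more influence than distant ones.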

GONG Yanlu, LV Jia. Co-Training Method Combined with Semi-Supervised Clustering and Weighted K-Nearest Neighbor[J]. Computer Engineering and Applications, 2019, 55(22): 114-118.

