Semi-supervised Learning Method via Wasserstein Distance Under Small Sample Condition

doi:10.3778/j.issn.1002-8331.2010-0353

Abstract

In recent years, deep neural network models trained on large-scale labeled data sets have shown strong performance in the image field, but large amounts of labeled data are expensive and difficult to collect. To make better use of unlabeled data, this paper proposes a semi-supervised learning method, Wasserstein consistency training (WCT). Jensen-Shannon divergence is introduced to simulate co-training and to organize massive unlabeled data, which improves the efficiency of co-training. Adversarial samples are generated with the fast gradient sign method to encourage differences between views, and Wasserstein distance is used as the measure for the network difference constraint, which prevents the deep neural networks from collapsing and keeps the network output smooth on the low-dimensional manifold. Experimental results show that the proposed method achieves an error rate of 0.85% on MNIST and 11.96% on CIFAR-10 with only 4,000 labeled samples, which indicates that it performs well in semi-supervised image classification under the small-sample condition.

Key words: small sample, semi-supervised learning, adversarial samples, deep neural network
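
The abstract names three building blocks (Jensen-Shannon divergence, the fast gradient sign method, and Wasserstein distance) without giving formulas. The sketch below shows each in its textbook form using NumPy. It is illustrative only: the linear softmax classifier, all function names, and the equal-size 1-D empirical form of the Wasserstein-1 distance are assumptions made for this example, not the paper's actual networks or loss.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between categorical distributions (natural log)."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    m = 0.5 * (p + q)
    kl_pm = np.sum(p * np.log(p / m), axis=-1)
    kl_qm = np.sum(q * np.log(q / m), axis=-1)
    return 0.5 * (kl_pm + kl_qm)

def cross_entropy(x, y_onehot, W, b):
    """Mean cross-entropy of a hypothetical linear softmax classifier."""
    p = softmax(x @ W + b)
    return -np.mean(np.sum(y_onehot * np.log(p + 1e-12), axis=-1))

def fgsm(x, y_onehot, W, b, epsilon):
    """Fast gradient sign method: step of size epsilon along sign(dLoss/dx)."""
    p = softmax(x @ W + b)
    grad_x = (p - y_onehot) @ W.T  # analytic input gradient for the linear model
    return x + epsilon * np.sign(grad_x)

def wasserstein_1d(u, v):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples."""
    u, v = np.sort(u), np.sort(v)
    return np.mean(np.abs(u - v))

# Toy data: 5 inputs with 4 features, 3 classes (all values are made up).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
b = np.zeros(3)
x = rng.normal(size=(5, 4))
y = np.eye(3)[rng.integers(0, 3, size=5)]

x_adv = fgsm(x, y, W, b, epsilon=0.1)
p_clean = softmax(x @ W + b)
p_adv = softmax(x_adv @ W + b)

print("loss clean / adversarial:", cross_entropy(x, y, W, b), cross_entropy(x_adv, y, W, b))
print("JS divergence per sample:", js_divergence(p_clean, p_adv))
print("W1 between max-probabilities:", wasserstein_1d(p_clean.max(axis=1), p_adv.max(axis=1)))
```

In a real implementation the input gradient would come from automatic differentiation through a deep network, and the Wasserstein term would compare the outputs of two networks rather than two scalar samples.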

MA Menghao, WANG Zhe. Semi-supervised Learning Method via Wasserstein Distance Under Small Sample Condition[J]. Computer Engineering and Applications, 2022, 58(5): 193-199.
