Few Samples Data Augmentation Method Based on NVAE and OB-Mix

doi:10.3778/j.issn.1002-8331.2208-0326

Abstract

Abstract: Because deep learning models depend heavily on large amounts of labeled data, many state-of-the-art object detection methods are difficult to apply to industrial inspection. To address this, a small-sample data augmentation method is proposed that combines NVAE for image generation with OB-Mix for data regularization. NVAE is first used to model the data distribution of the detection target images, and new target images that belong to the same distribution as the real target images are then generated by sampling latent variables. Given the generated target images, the proposed OB-Mix data augmentation strategy mixes them with background images at random positions to construct new image data, thereby improving the localization and generalization ability of the network. Using only 474 labeled images and 400 background images that contain no detection targets, the method raises the detection precision of YOLOv5 to 95.86%, which is 17.60 percentage points higher than training without it.

Key words: data augmentation, small-sample, image generation, nouveau variational auto-encoder (NVAE), surface defect detection, deep learning
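
The abstract describes OB-Mix only at a high level: generated target images are mixed with target-free background images at random positions. The following is a minimal sketch of that idea, not the authors' implementation. The function name `ob_mix`, the `alpha` blending parameter, the assumption of `uint8` arrays of shape (H, W, C), and the YOLO-style normalized label output are all illustrative assumptions.

```python
import numpy as np


def ob_mix(background, target, rng, alpha=1.0, class_id=0):
    """Paste `target` onto a copy of `background` at a random position.

    Hypothetical sketch: both inputs are assumed to be uint8 arrays of
    shape (H, W, C). Returns the mixed image and a YOLO-style label
    (class_id, x_center, y_center, width, height) normalized to [0, 1].
    """
    bg_h, bg_w = background.shape[:2]
    t_h, t_w = target.shape[:2]
    if t_h > bg_h or t_w > bg_w:
        raise ValueError("target must fit inside the background")

    # Sample the top-left corner so the patch stays fully inside the image.
    top = int(rng.integers(0, bg_h - t_h + 1))
    left = int(rng.integers(0, bg_w - t_w + 1))

    mixed = background.astype(np.float32)  # astype returns a new array
    region = mixed[top:top + t_h, left:left + t_w]
    # alpha = 1.0 is a hard paste; alpha < 1.0 blends the target with the
    # background. The blending option is an assumption, not from the paper.
    blended = alpha * target.astype(np.float32) + (1.0 - alpha) * region
    mixed[top:top + t_h, left:left + t_w] = blended

    label = (
        class_id,
        (left + t_w / 2) / bg_w,  # normalized box center x
        (top + t_h / 2) / bg_h,   # normalized box center y
        t_w / bg_w,               # normalized box width
        t_h / bg_h,               # normalized box height
    )
    return np.rint(mixed).astype(np.uint8), label


# Example: a white 16x16 "target" pasted onto a black 64x64 background.
rng = np.random.default_rng(0)
background = np.zeros((64, 64, 3), dtype=np.uint8)
target = np.full((16, 16, 3), 255, dtype=np.uint8)
mixed, label = ob_mix(background, target, rng)
```

Because the paste position is known, the bounding-box label comes for free, which is one plausible reason such a strategy suits a detector like YOLOv5 when labeled images are scarce. In practice the `target` array would be an image sampled from the trained NVAE rather than a synthetic patch.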

YANG Wei, ZHONG Mingfeng, YANG Gen, HOU Zhicheng, WANG Weijun, YUAN Hai. Few Samples Data Augmentation Method Based on NVAE and OB-Mix[J]. Computer Engineering and Applications, 2024, 60(2): 103-112.

杨玮, 钟名锋, 杨根, 侯至丞, 王卫军, 袁海. 基于NVAE和OB-Mix的小样本数据增强方法[J]. 计算机工程与应用, 2024, 60(2): 103-112.
