Computer Engineering and Applications, 2023, Vol. 59, Issue (23): 202-210. DOI: 10.3778/j.issn.1002-8331.2206-0470

• Graphics and Image Processing •

Multi-Label Car Damage Image Generation Based on Few Shot StyleGAN

DING Kai, YANG Jiaxi, YANG Yao, NA Chongning   

  1. Financial Technology Research Center of Zhejiang Lab, Hangzhou 311000, China
  • Online: 2023-12-01 Published: 2023-12-01

Multi-Class Car Damage Image Generation Method Based on Few-Shot StyleGAN

DING Kai, YANG Jiaxi, YANG Yao, NA Chongning

  1. Zhejiang Lab, Hangzhou 311121

Abstract: Existing multi-class damaged car image datasets suffer from limited sample counts and insufficient, imbalanced class distributions, problems that image generation can relieve. StyleGAN generates high-resolution images without distortion and has proven effective, especially on medical and face images; however, little research has addressed few-shot learning or data with high sample diversity. This paper investigates a few-shot StyleGAN generation method for car damage images with high sample diversity. It first parametrically analyzes and optimizes the key factors that affect the convergence of generative adversarial models trained on a limited number of samples, achieving an FID of 41.3 on a dataset of 1,500 damaged car images at 128×128 pixels. Based on the generative adversarial model, it proposes three damaged car image generation schemes: random generation, style-based generation, and decoupling-based generation. Experimental results show that augmenting the dataset with generated images improves damaged car image classification, which verifies the usefulness of the generative adversarial models. The paper further investigates latent vector decoupling in the generation space and the physical meaning of the decoupled directions. It also analyzes how, and why, the different image generation methods differ in the improvement they bring to image classification. These analyses provide insight for further improving the generation models.
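The FID (Fréchet Inception Distance) quoted above compares the mean and covariance of feature vectors extracted from real and generated images. The following is a minimal sketch of that distance only, not the paper's evaluation code. It assumes the features have already been extracted (in standard FID they come from an Inception-v3 network, which is not included here), and the function name `frechet_distance` is hypothetical.

```python
import numpy as np

def frechet_distance(feats_real, feats_gen):
    """Frechet distance between Gaussians fitted to two feature sets.

    Each input is an array of shape (num_samples, feature_dim).
    Assumption: the rows are precomputed image features; this sketch
    does not run any feature-extraction network.
    """
    mu_r = feats_real.mean(axis=0)
    mu_g = feats_gen.mean(axis=0)
    sigma_r = np.cov(feats_real, rowvar=False)
    sigma_g = np.cov(feats_gen, rowvar=False)

    # Tr((sigma_r sigma_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of sigma_r @ sigma_g, which are real and non-negative for
    # covariance matrices (up to rounding error, hence the clip).
    eigvals = np.linalg.eigvals(sigma_r @ sigma_g)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(sigma_r) + np.trace(sigma_g)
                 - 2.0 * trace_sqrt)
```

Lower values mean the generated feature distribution is closer to the real one. With only about 1,500 real images the covariance estimate is noisy, so scores are best compared between runs that use the same sample counts.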

Key words: image generation, generative adversarial network, few-shot learning, data augmentation

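Abstract: Existing car damage image datasets suffer from small sample sizes, insufficient diversity, and imbalanced distributions, problems that image generation can alleviate. StyleGAN is a relatively recent method that generates high-resolution images without distortion and has been shown to be effective for augmenting medical and face images, but little research has addressed few-shot settings or highly diverse samples. This paper studies a conditional StyleGAN generation method for car damage images under few-shot, high-diversity conditions. For adversarial model training on limited car damage samples, the factors that affect model convergence are analyzed parametrically and optimized, reducing the FID to 41.3 on a multi-class car damage image dataset of about 1,500 samples at 128×128 resolution and resolving the failure of conventional methods to converge when samples are scarce. On this basis, three adversarial-model-based methods for multi-class car damage image generation are constructed: random generation, style-mixing generation, and decoupled-scaling generation. The three methods are used to augment the car damage training set, and numerical experiments demonstrate their effectiveness for the downstream image classification task. The paper also studies latent vector decoupling in the generative model's latent space, and analyzes the physical meaning of the decoupled directions as well as the differences among the generation methods in how much they improve image classification and the reasons for those differences. These results provide clues and evidence for further improving adversarial-model-based multi-class car damage image generation. The dataset and code are publicly available at https://github.com/derby-ding/StyleGAN_cardemage_class.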
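The three generation schemes named in the abstract all operate on StyleGAN's per-layer style codes, and the sketch below illustrates them on toy latent vectors. It is a sketch under stated assumptions, not the authors' implementation: the layer count, latent size, and `np.tanh` mapping are hypothetical stand-ins for a trained network, class conditioning is omitted, and PCA is swapped in as the way to find editing directions because the abstract does not say which decoupling method the paper uses.

```python
import numpy as np

NUM_LAYERS, W_DIM = 12, 64  # hypothetical sizes, not taken from the paper

def sample_w(rng, n, mapping):
    """Random generation: draw z ~ N(0, I) and map it to per-layer style codes."""
    z = rng.normal(size=(n, W_DIM))
    w = mapping(z)                                        # (n, W_DIM)
    return np.repeat(w[:, None, :], NUM_LAYERS, axis=1)   # (n, NUM_LAYERS, W_DIM)

def style_mix(w_a, w_b, crossover):
    """Style mixing: layers before `crossover` come from w_a, the rest from w_b."""
    mixed = w_a.copy()
    mixed[:, crossover:, :] = w_b[:, crossover:, :]
    return mixed

def principal_directions(w_samples, k):
    """Top-k PCA directions of sampled style codes.

    Assumption: PCA stands in for the paper's decoupling method.
    """
    centered = w_samples - w_samples.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]                                         # (k, W_DIM), unit-norm rows

def scale_along(w, direction, alpha):
    """Decoupled scaling: shift every layer's code by alpha * direction."""
    return w + alpha * direction[None, None, :]

rng = np.random.default_rng(0)
mapping = np.tanh  # toy stand-in for a trained mapping network
w_a = sample_w(rng, 4, mapping)
w_b = sample_w(rng, 4, mapping)
w_mix = style_mix(w_a, w_b, crossover=6)
dirs = principal_directions(sample_w(rng, 500, mapping)[:, 0, :], k=3)
w_edit = scale_along(w_a, dirs[0], alpha=2.0)
```

In a real pipeline each of `w_a`, `w_mix`, and `w_edit` would be passed to the trained synthesis network to render images, and the class label of the source code would be carried over to label the augmented sample.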

Key words: image generation, generative adversarial network, few-shot learning, data augmentation