Computer Engineering and Applications, 2024, Vol. 60, Issue (6): 214-221. DOI: 10.3778/j.issn.1002-8331.2210-0361

• Graphics and Image Processing •

Commonsense Oriented Fine-Grained Data Augmentation

LI Huachao (李华超), KANG Bin (康彬), WANG Lei (王磊)

  1. School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
  2. School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
  • Online: 2024-03-15 Published: 2024-03-15

Abstract: Representative research on data augmentation has mainly targeted common classification benchmarks such as ImageNet. Because the intra-class and inter-class relations among samples in fine-grained visual classification (FGVC) datasets differ markedly from those in ordinary classification datasets, data augmentation methods designed for FGVC still need further study. Starting from the fine-grained recognition task and the special properties of its datasets, this paper proposes a commonsense-aided fine-grained semantic image patch mixing method (ComSipmix). The method uses commonsense knowledge to mine latent associations between sample labels and, on that basis, designs a multi-branch convolutional neural network for a structured image mixing strategy, so that the mixing process pays more attention to subtle differences between targets. Extensive experiments show that the proposed method clearly outperforms mainstream image-mixing-based data augmentation methods. The experiments also show that the proposed commonsense knowledge improves the performance of several other image-mixing-based data augmentation models.

Key words: data augmentation, commonsense graph, multi-branch convolutional neural network
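
The abstract does not give the details of ComSipmix, so the following is only a rough illustration of the general idea it describes: choosing which images to mix according to relations between their labels, then mixing at the patch level. It is a generic CutMix-style sketch, not the authors' method. The names `pick_partners`, `patch_mix`, and `label_sim`, and the use of a plain similarity matrix in place of commonsense-derived label associations, are assumptions made for illustration.

```python
import numpy as np


def pick_partners(labels, label_sim):
    """For each sample, pick the batch index whose label is most related
    (per the assumed similarity matrix) among samples with a different label."""
    labels = np.asarray(labels)
    sim = label_sim[labels][:, labels].astype(float)   # (B, B) pairwise label similarity
    sim[labels[:, None] == labels[None, :]] = -np.inf  # exclude same-label pairs (and self)
    partners = sim.argmax(axis=1)
    # Fall back to the sample itself when no different-label partner exists.
    no_partner = np.isinf(sim.max(axis=1))
    partners[no_partner] = np.arange(len(labels))[no_partner]
    return partners


def patch_mix(images, labels, label_sim, num_classes, lam=0.7, rng=None):
    """CutMix-style mixing: paste a rectangular patch from the partner image.

    images: float array (B, H, W, C); labels: int array (B,).
    lam is the target fraction of each image that is kept.
    Returns mixed images and soft labels of shape (B, num_classes).
    """
    rng = np.random.default_rng() if rng is None else rng
    images = np.asarray(images, dtype=float)
    labels = np.asarray(labels)
    b, h, w, _ = images.shape
    partners = pick_partners(labels, label_sim)

    # Patch size chosen so that its area is about (1 - lam) of the image.
    ph = int(round(h * np.sqrt(1.0 - lam)))
    pw = int(round(w * np.sqrt(1.0 - lam)))
    top = int(rng.integers(0, h - ph + 1))
    left = int(rng.integers(0, w - pw + 1))

    mixed = images.copy()
    mixed[:, top:top + ph, left:left + pw] = images[partners, top:top + ph, left:left + pw]

    # Label weights follow the area actually pasted.
    keep = 1.0 - (ph * pw) / (h * w)
    soft = np.zeros((b, num_classes))
    soft[np.arange(b), labels] += keep
    soft[np.arange(b), labels[partners]] += 1.0 - keep
    return mixed, soft
```

In this sketch, restricting partners to closely related labels means each mixed image combines two classes that are plausibly hard to tell apart, which is one simple way to push training toward subtle differences. How the paper actually derives label associations and structures the mixing is not stated in the abstract.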