Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (15): 151-159. DOI: 10.3778/j.issn.1002-8331.2204-0278

• Pattern Recognition and Artificial Intelligence •


Few-Shot Learning Method with Multi-Scale Feature Aggregation

ZENG Wu, MAO Guojun   

1. College of Computer Science and Mathematics, Fujian University of Technology, Fuzhou 350118, China
  2. Fujian Key Laboratory of Big Data Mining and Applications, Fuzhou 350118, China
• Online: 2023-08-01   Published: 2023-08-01


Abstract: Most few-shot learning methods suffer from insufficient feature extraction, difficulty in accurately capturing the important feature information in samples, and intra-class sample diversity that can cause the class center point to deviate. To address these problems, a few-shot learning method with multi-scale feature aggregation (MSFA) is proposed. Specifically, the method uses a multi-scale generation module to generate feature information at multiple scales for all training samples, applies self-attention to aggregate the important feature information at each scale, and concatenates the important features across scales to obtain a more accurate feature representation of the image. Finally, for each query set sample, the distance to the class prototype and the average distance to the individual samples within the class are computed separately, and the final distance is obtained as their weighted combination. Extensive experiments are conducted on three datasets: miniImageNet, tieredImageNet, and Stanford Dogs. The results show that the proposed method greatly improves the classification performance of the baseline method; in particular, on miniImageNet under the 5-way 1-shot and 5-way 5-shot settings, the classification accuracy improves over the Prototypical Network by 7.42 and 6.28 percentage points, respectively.
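The two key computations described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the dot-product self-attention, the Euclidean metric, and the weighting coefficient `alpha` are all assumptions, since the abstract does not specify them.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def aggregate_scales(scale_feats):
    """Aggregate per-scale features with self-attention, then concatenate.

    scale_feats: (s, d) array, one d-dim feature vector per scale.
    Returns an (s*d,) vector. A simplified scaled dot-product
    self-attention stands in for the paper's unspecified module.
    """
    d = scale_feats.shape[1]
    attn = softmax(scale_feats @ scale_feats.T / np.sqrt(d))
    return (attn @ scale_feats).reshape(-1)

def msfa_distance(query, support, alpha=0.5):
    """Weighted distance of a query sample to one class.

    query:   (d,) feature vector of the query sample.
    support: (k, d) feature vectors of the class's k support samples.
    alpha:   weight between the prototype distance and the mean
             per-sample distance (value here is a placeholder).
    """
    prototype = support.mean(axis=0)                 # class prototype
    d_proto = np.linalg.norm(query - prototype)      # distance to prototype
    d_mean = np.linalg.norm(support - query, axis=1).mean()  # mean distance to support samples
    return alpha * d_proto + (1 - alpha) * d_mean
```

At inference, the query would be assigned to the class with the smallest `msfa_distance`; classes with diverse support samples are penalized less by the prototype term alone because the per-sample term captures their spread.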

Key words: few-shot learning, feature enhancement, self-attention mechanism, multi-scale feature fusion, image classification