Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (3): 184-192. DOI: 10.3778/j.issn.1002-8331.2204-0201

• Graphics and Image Processing •

Adaptive Feature Fusion Embedding Network for Few Shot Fine-Grained Image Classification

XIE Yaohua, ZHANG Weichuan, REN Jie, JING Junfeng   

  1. School of Electronic Information, Xi’an Polytechnic University, Xi’an 710600, China
  2. Xi’an Polytechnic University Branch of Shaanxi Artificial Intelligence Joint Laboratory, Xi’an 710600, China
  • Online: 2023-02-01 Published: 2023-02-01

基于自适应特征融合的小样本细粒度图像分类

解耀华,章为川,任劼,景军锋   

  1. 西安工程大学 电子信息学院,西安 710600
  2. 陕西省人工智能联合实验室 西安工程大学分部,西安 710600

Abstract: Existing few-shot learning algorithms do not fully extract the features of fine-grained images, which leads to low classification accuracy on such images. To better model the features extracted in metric-based few-shot fine-grained image classification (FSFGIC) algorithms, this paper proposes an FSFGIC algorithm based on adaptive feature fusion. First, an adaptive feature fusion embedding network is designed for feature extraction; it extracts both deep, strongly semantic features and shallow location and structure features, and uses an adaptive algorithm and an attention mechanism to select the key features. Second, single-image training and multi-image training are applied in succession to train the feature extraction network, so that the network attends to the relationships between images while extracting the features of each sample. Finally, to bring feature vectors of the same class closer together in the feature space and push feature vectors of different classes farther apart, feature distribution transformation, QR decomposition (orthogonal-triangular decomposition), and normalization are applied to the extracted feature vectors. The proposed algorithm is compared with 9 other algorithms, and 5-way 1-shot and 5-way 5-shot accuracy is evaluated on multiple fine-grained datasets. Accuracy improves by 5.27 and 2.90 percentage points on the Stanford Dogs dataset and by 3.29 and 4.23 percentage points on the Stanford Cars dataset. On the CUB-200 dataset, the 5-way 1-shot accuracy is 0.82 percentage points lower than that of DLG, while the 5-way 5-shot accuracy is 1.55 percentage points higher.

Key words: few shot learning, fine-grained image classification, adaptive feature fusion, attention mechanism
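
The post-processing step named in the abstract (feature distribution transformation, QR decomposition, normalization) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the transformation, so an element-wise power transform is assumed here as a stand-in, and nearest-prototype classification with Euclidean distance is assumed for the metric-based decision. All function names and the `beta` parameter are hypothetical.

```python
import numpy as np


def power_transform(feats, beta=0.5, eps=1e-6):
    """Assumed stand-in for the feature distribution transformation.

    Expects non-negative features (e.g. post-ReLU activations).
    """
    return np.power(feats + eps, beta)


def qr_reduce(feats):
    """Re-express n samples of dimension d through the R factor of a QR decomposition.

    feats.T = Q R with orthonormal columns in Q, so the columns of R keep
    all pairwise inner products of the samples while using at most n dimensions.
    """
    r = np.linalg.qr(feats.T, mode="r")
    return r.T


def l2_normalize(feats):
    """Scale every feature vector to unit Euclidean length."""
    return feats / np.linalg.norm(feats, axis=1, keepdims=True)


def classify_episode(support, support_labels, query, n_way):
    """Classify the queries of one N-way K-shot episode by nearest class prototype."""
    n_support = len(support)
    # Transform support and query features jointly so they share one basis.
    feats = np.concatenate([support, query], axis=0)
    feats = l2_normalize(qr_reduce(power_transform(feats)))
    z_support, z_query = feats[:n_support], feats[n_support:]
    # Prototype = mean of the transformed support vectors of each class.
    prototypes = np.stack(
        [z_support[support_labels == c].mean(axis=0) for c in range(n_way)]
    )
    dists = np.linalg.norm(z_query[:, None, :] - prototypes[None, :, :], axis=2)
    return dists.argmin(axis=1)
```

In a 5-way 1-shot episode, `support` would hold one embedded image per class and `support_labels` the integers 0 to 4; in a 5-way 5-shot episode it would hold five per class. The embeddings themselves would come from the feature extraction network, which this sketch does not model.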

摘要: 现有的小样本学习算法未能充分提取细粒度图像的特征,导致细粒度图像分类准确率较低。为了更好地对基于度量的小样本细粒度图像分类算法中提取的特征进行建模,提出了一种基于自适应特征融合的小样本细粒度图像分类算法。在特征提取网络上设计了一种自适应特征融合嵌入网络,可以同时提取深层的强语义特征和浅层的位置结构特征,并使用自适应算法和注意力机制提取关键特征。在训练特征提取网络上采用单图训练和多图训练方法先后训练,在提取样本特征的同时关注样本之间的联系。为了使得同一类的特征向量在特征空间中的距离更加接近,不同类的特征向量的距离更大,对所提取的特征向量做特征分布转换、正交三角分解和归一化处理。提出的算法与其他9种算法进行实验对比,在多个细粒度数据集上评估了5 way 1 shot的准确率和5 way 5 shot的准确率。在Stanford Dogs数据集上的准确率提升了5.27和2.90个百分点,在Stanford Cars数据集上的准确率提升了3.29和4.23个百分点,在CUB-200数据集上的5 way 1 shot的准确率只比DLG略低0.82个百分点,但是5 way 5 shot上提升了1.55个百分点。

关键词: 小样本学习, 细粒度图像分类, 自适应特征融合, 注意力机制