Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (19): 53-63. DOI: 10.3778/j.issn.1002-8331.2201-0374

• Research Hotspots and Reviews •

Review of Applications of CNN and Transformer in Fine-Grained Image Recognition

MA Yao, ZHI Min, YIN Yanjun, PING Ping   

  1. College of Computer Science and Technology, Inner Mongolia Normal University, Hohhot 010022, China
  • Online: 2022-10-01 Published: 2022-10-01




Abstract: Fine-grained image recognition aims to distinguish subcategories within a broader category of images. The task is challenging because images of different subcategories differ only subtly. With the continued progress of deep learning, deep-learning-based methods have become increasingly capable of localizing local regions and representing features. Among them, algorithms based on convolutional neural networks (CNN) and Transformer have greatly improved the accuracy of fine-grained image recognition, and the field has developed significantly. To trace the development of these two families of methods in fine-grained image recognition, this paper reviews recent methods in the field that use only category labels. First, the concept of fine-grained image recognition is introduced and the mainstream fine-grained image datasets are described in detail. Second, fine-grained image recognition methods based on CNNs and vision Transformers, together with their performance, are introduced. Finally, future research directions for fine-grained image recognition are summarized.

Key words: fine-grained image recognition, deep learning, convolutional neural network (CNN), Transformer
