Computer Engineering and Applications, 2022, Vol. 58, Issue (6): 1-16. DOI: 10.3778/j.issn.1002-8331.2106-0442

• Research Hotspots and Reviews •

Research Progress of Transformer Based on Computer Vision

LIU Wenting, LU Xinming   

  1. College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, Shandong 266500, China
  • Online: 2022-03-15 Published: 2022-03-15




Abstract: The Transformer is a deep neural network that is built on the self-attention mechanism and processes data in parallel. In recent years, Transformer-based models have become an important research direction for computer vision tasks. To address the current lack of domestic review articles on the Transformer, this paper surveys its applications in computer vision. It reviews the basic principles of the Transformer model, focuses on its application to seven vision tasks, including image classification, object detection, and image segmentation, and analyzes the Transformer-based models that achieve notable results. Finally, it summarizes the challenges facing the Transformer in computer vision and discusses future development trends.

Key words: Transformer, computer vision, self-attention mechanism, neural network
