Computer Engineering and Applications (计算机工程与应用) ›› 2023, Vol. 59 ›› Issue (22): 15-35. DOI: 10.3778/j.issn.1002-8331.2302-0273

• Hot Topics and Reviews •

图卷积神经网络及其在图像识别领域的应用综述

李文静,白静,彭斌,杨瞻源   

  1. 北方民族大学 计算机科学与工程学院,银川 750021
  2. 国家民委图形图像智能处理实验室,银川 750021
  • Publication date: 2023-11-15 Release date: 2023-11-15

Graph Convolutional Neural Network and Its Application in Image Recognition

LI Wenjing, BAI Jing, PENG Bin, YANG Zhanyuan   

  1. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China
  2. The Key Laboratory of Images & Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan 750021, China
  • Online: 2023-11-15 Published: 2023-11-15

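Abstract (from the Chinese original): Convolutional neural networks are widely used in image recognition and show strong feature extraction ability, but they can only process structured data in Euclidean space and cannot be applied to unstructured data. To address this limitation, graph convolutional neural networks use spectral-domain and spatial-domain methods to extend the scope of the convolution operation so that features can be learned in non-Euclidean space; they retain translation invariance on graph data and enable representation learning on unstructured graph data. The paper first describes the basic principles of the two types of graph convolutional neural network, based on the frequency (spectral) domain and the spatial domain, and introduces related improvements. It then turns to image recognition, focusing on specific applications of graph convolutional neural networks in multi-label image recognition, skeleton-based action recognition, and hyperspectral image classification, summarizing the latest research progress and comparing and analyzing the performance of the related models. Finally, it summarizes the paper and looks ahead to future directions.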

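Key words (from the Chinese original): image recognition, graph convolutional neural network, non-Euclidean space, deep learning, artificial intelligence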

Abstract: Convolutional neural networks (CNNs) have found widespread application in image recognition and demonstrate strong feature extraction capabilities. However, they are designed to process structured data in Euclidean space, which makes them poorly suited to unstructured data. To address this limitation, graph convolutional neural networks (GCNs) use spectral and spatial methods to extend the convolution operation to non-Euclidean spaces. GCNs retain translation invariance on graph data and support representation learning on unstructured graph data. This paper first explains the basic principles of the two types of GCN, spectral-domain and spatial-domain, and reviews related improvements. It then turns to image recognition, introducing the application of GCNs to multi-label image recognition, skeleton-based action recognition, and hyperspectral image classification, summarizing recent research progress, and comparing and analyzing the performance of related models. Finally, it summarizes the paper and discusses future research directions.

Key words: image recognition, graph convolutional neural network, non-Euclidean space, deep learning, artificial intelligence
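
The abstract describes graph convolution only in general terms and gives no formula. As an illustration of the idea, the sketch below applies one propagation step using the widely known first-order rule ReLU(D^-1/2 (A + I) D^-1/2 X W), which can be read either as a simplified spectral filter or as a spatial aggregation over neighbors. The choice of this rule, the function names `matmul` and `gcn_layer`, and the toy graph are assumptions made here for illustration, not details taken from the paper.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, features, weight):
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 X W) (illustrative)."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Symmetric normalization by node degree (self-loop included).
    d_inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    a_norm = [[d_inv_sqrt[i] * a_hat[i][j] * d_inv_sqrt[j] for j in range(n)]
              for i in range(n)]
    # Aggregate neighbor features, apply the shared linear map, then ReLU.
    aggregated = matmul(a_norm, features)
    return [[max(v, 0.0) for v in row] for row in matmul(aggregated, weight)]

# Hypothetical toy input: a 3-node path graph 0 - 1 - 2 with one-hot features.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
features = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]
weight = [[1.0, 1.0],
          [1.0, 1.0],
          [1.0, 1.0]]
out = gcn_layer(adj, features, weight)
```

Unlike an image convolution, which slides a fixed kernel over a regular pixel grid, this step takes its neighborhood from the adjacency matrix, so the same weights apply to nodes with any number of neighbors.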