Computer Engineering and Applications, 2024, Vol. 60, Issue (22): 58-73. DOI: 10.3778/j.issn.1002-8331.2406-0100

• Research Hotspots and Reviews •

Research Progress of Image Inpainting Methods Based on Deep Learning

CHEN Wenxiang, TIAN Qichuan, LIAN Lu, ZHANG Xiaohang, WANG Haoji   

  1. College of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
  2. Beijing Key Laboratory of Intelligent Processing for Building Big Data, Beijing 100044, China
  • Online: 2024-11-15 Published: 2024-11-14

基于深度学习的图像修复方法研究进展

陈文祥,田启川,廉露,张晓行,王浩吉   

  1. 北京建筑大学 电气与信息工程学院,北京 100044
  2. 建筑大数据智能处理方法研究北京市重点实验室,北京 100044

Abstract: Image inpainting, the algorithmic restoration of damaged or missing regions of an image, is a major research focus in computer vision. This paper reviews the development of deep learning-based image inpainting methods in recent years and categorizes them into single-modal and multi-modal methods. Single-modal methods are divided into convolutional autoencoder-based, GAN-based, Transformer-based, and diffusion model-based methods; multi-modal methods comprise text-guided, audio-guided, video-guided, and multi-modal fusion-based methods. The paper compares the principles, advantages, and disadvantages of each category, introduces commonly used datasets and evaluation metrics, assesses the performance of representative methods on those datasets, and discusses current challenges and future directions in the field.

Key words: image inpainting, computer vision, deep learning
