Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (16): 236-247. DOI: 10.3778/j.issn.1002-8331.2311-0430

• Graphics and Image Processing •

多头注意力机制的全频图像去噪算法

江结林,史明月,杨海东,崔燕   

  1. 南京信息工程大学 软件学院,南京 210044
  2. 江苏省先进计算与智能服务工程研究中心,南京 210044
  3. 内蒙古工业大学 信息工程学院,呼和浩特 026208
  4. 南京特殊教育师范学院 数学与信息科学学院,南京 210038
  • 出版日期:2024-08-15 发布日期:2024-08-15

Omni-Frequency Image Denoising with Multi-Head Attention

JIANG Jielin, SHI Mingyue, YANG Haidong, CUI Yan   

  1. School of Software, Nanjing University of Information Science and Technology, Nanjing 210044, China
  2. Jiangsu Province Engineering Research Center of Advanced Computing and Intelligent Services, Nanjing 210044, China
  3. School of Information Engineering, Inner Mongolia University of Technology, Hohhot 026208, China
  4. College of Mathematics and Information Science, Nanjing Normal University of Special Education, Nanjing 210038, China
  • Online:2024-08-15 Published:2024-08-15

摘要: 近年来,深度卷积神经网络(convolutional neural network,CNN)在图像去噪领域取得了显著成果。然而,现有的大部分去噪方法都是将噪声图像直接输入CNN模型训练,依赖于裁剪大量的图像训练块,重复裁剪的区域不仅浪费计算资源,还限制了特征提取的多样性,导致图像纹理细节的丢失。为解决这些问题,提出一种适用于去除加性高斯白噪声和真实图像噪声的全频增强多头注意力去噪网络。该方法将噪声图像分解为低频和高频分量,并与噪声图像一起输入网络进行训练,通过增加网络宽度来提取更丰富的图像特征。特征增强多头注意力机制关注图像级别的特征,能够保留更多的纹理细节。为了得到干净噪声映射,还设计了噪声学习模块来去除冗余特征并优化图像的残差特征。在Set12、CBSD68等多个数据集上验证了所提出方法的有效性。实验结果显示,该方法在灰度噪声图像去噪、彩色噪声图像去噪以及真实图像去噪方面均优于ADNet、AMDNet、MWDCNN等主流去噪方法,而且使用该方法去噪后的图像具有更清晰的视觉效果。

关键词: 图像去噪, 高斯噪声, 多头注意力, 残差学习, 频率分解

Abstract: In recent years, deep convolutional neural networks (CNNs) have achieved remarkable results in image denoising. However, most existing denoising methods feed noisy images directly into a CNN for training and rely on cropping a large number of training patches. The repeatedly cropped regions not only waste computing resources but also limit the diversity of the extracted features, which leads to the loss of image texture details. To address these issues, this paper proposes an omni-frequency enhanced multi-head attention network (OEMANet) for removing additive white Gaussian noise and real-world image noise. The noisy image is decomposed into low- and high-frequency components, and these two components are fed into OEMANet together with the noisy image for training; widening the network in this way allows richer image features to be extracted. A feature-enhanced multi-head attention mechanism attends to image-level features and preserves more texture details. To obtain a clean noise map, a noise learning module is designed to remove redundant features and refine the residual features of the image. The effectiveness of OEMANet is verified on multiple datasets, including Set12 and CBSD68. Experimental results show that the proposed method outperforms mainstream denoising methods such as ADNet, AMDNet, and MWDCNN on grayscale image denoising, color image denoising, and real-image denoising, and that images denoised by OEMANet have clearer visual quality.

Key words: image denoising, Gaussian noise, multi-head attention, residual learning, frequency decomposition
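
The input pipeline outlined in the abstract (noisy image plus its low- and high-frequency components, followed by residual noise removal) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the frequency decomposition is computed, so the box-filter low-pass used here is an assumption, and the helper names (`add_awgn`, `box_blur`, `decompose`, `build_input`, `residual_denoise`) are hypothetical. The network itself is not modelled; the last helper only shows the residual-learning step of subtracting a predicted noise map from the noisy image.

```python
import numpy as np


def add_awgn(image, sigma, rng):
    """Add zero-mean white Gaussian noise with standard deviation sigma."""
    image = np.asarray(image, dtype=np.float64)
    return image + rng.normal(0.0, sigma, size=image.shape)


def box_blur(image, radius=2):
    """Low-pass filter (assumed choice): mean over a (2r+1)x(2r+1) window."""
    image = np.asarray(image, dtype=np.float64)
    k = 2 * radius + 1
    padded = np.pad(image, radius, mode="reflect")
    out = np.zeros_like(image)
    h, w = image.shape
    # Sum the k*k shifted copies of the padded image, then normalise.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


def decompose(noisy, radius=2):
    """Split an image into a low-frequency part and the high-frequency rest."""
    low = box_blur(noisy, radius)
    high = np.asarray(noisy, dtype=np.float64) - low
    return low, high


def build_input(noisy, radius=2):
    """Stack noisy image, low and high components as a 3-channel input."""
    low, high = decompose(noisy, radius)
    return np.stack([np.asarray(noisy, dtype=np.float64), low, high], axis=0)


def residual_denoise(noisy, predicted_noise):
    """Residual learning: the clean estimate is noisy minus predicted noise."""
    return np.asarray(noisy, dtype=np.float64) - predicted_noise


rng = np.random.default_rng(0)
clean = np.linspace(0.0, 1.0, 32 * 32).reshape(32, 32)  # synthetic test image
noisy = add_awgn(clean, sigma=25 / 255, rng=rng)
net_input = build_input(noisy)  # shape (3, 32, 32)
```

By construction the two components sum back to the noisy image, so the decomposition only re-expresses information the network already receives; stacking it as extra channels widens the input. Any other low-pass filter, such as a Gaussian blur or a wavelet approximation band, could replace `box_blur` in this sketch.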