Computer Engineering and Applications ›› 2026, Vol. 62 ›› Issue (8): 298-307. DOI: 10.3778/j.issn.1002-8331.2503-0005

• Graphics and Image Processing •

Frequency-Assisted Mamba Network for Stereo Image Super-Resolution

MA Mingsheng1,2, ZHANG De1,2+

  1. School of Intelligence Science and Technology, Beijing University of Civil Engineering and Architecture, Beijing 102616, China
  2. Beijing Key Laboratory of Super Intelligent Technology for Urban Architecture, Beijing University of Civil Engineering and Architecture, Beijing 102616, China
    + Corresponding author E-mail: zhangde@bucea.edu.cn
  • Received: 2025-03-03 Revised: 2025-06-10 Online: 2026-04-15 Published: 2026-04-15
  • Supported by:
    National Natural Science Foundation of China (62271035); Beijing Natural Science Foundation (4232021).

Abstract: Stereo image super-resolution (SSR) aims to combine the image information from the two viewpoints of a binocular camera to generate images with higher resolution and richer detail. Current Transformer-based SSR methods capture global dependencies and broad contextual information through self-attention, but their computational complexity grows quadratically with sequence length, which limits running efficiency. To address this, a stereo image super-resolution network based on the Mamba architecture (MambaSSR) is proposed. The Mamba architecture has linear computational complexity, which markedly improves running efficiency while offering modeling capability comparable to that of Transformers. In addition, a frequency-domain attention module is designed in MambaSSR to preserve key frequency information so that texture details are reconstructed more accurately. Experiments on several public datasets show that, compared with mainstream methods, the proposed MambaSSR model achieves superior super-resolution reconstruction results with higher running efficiency.

Key words: stereo image, image super-resolution, Mamba, frequency domain
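
The two ideas named in the abstract, a linear-time state space scan and attention applied in the frequency domain, can be illustrated with a minimal NumPy sketch. This is not the MambaSSR implementation, whose details are not given on this page. The function names `ssm_scan` and `frequency_attention`, the fixed diagonal parameters `a`, `b`, `c`, and the gate logits `w` are all assumptions made for illustration; in particular, Mamba makes its parameters input-dependent (selective), which is omitted here.

```python
import numpy as np


def ssm_scan(x, a, b, c):
    """Run a diagonal state space recurrence over a sequence in O(L) steps.

    Simplified, non-selective stand-in for a Mamba-style scan (assumption).
    x: (L, D) input sequence; a, b, c: (D, N) per-channel diagonal parameters.
    Recurrence: h_t = a * h_{t-1} + b * x_t,  y_t = sum over N of c * h_t.
    """
    L, D = x.shape
    h = np.zeros_like(a, dtype=float)   # hidden state, shape (D, N)
    y = np.empty((L, D))
    for t in range(L):                  # one pass: cost grows linearly with L
        h = a * h + b * x[t][:, None]
        y[t] = (c * h).sum(axis=1)
    return y


def frequency_attention(feat, w):
    """Reweight the 2-D spectrum of a feature map with a sigmoid gate.

    Hypothetical frequency-domain attention, not the paper's module.
    feat: (C, H, W) real feature map.
    w:    (C, H, W // 2 + 1) gate logits, standing in for learned weights.
    """
    spec = np.fft.rfft2(feat, axes=(-2, -1))      # complex half-spectrum
    gate = 1.0 / (1.0 + np.exp(-w))               # values in (0, 1)
    # Attenuate each frequency bin, then return to the spatial domain.
    return np.fft.irfft2(spec * gate, s=feat.shape[-2:], axes=(-2, -1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq = rng.standard_normal((16, 4))            # L = 16 tokens, D = 4 channels
    a = rng.uniform(0.1, 0.9, (4, 8))             # stable decay factors, N = 8
    b = rng.standard_normal((4, 8))
    c = rng.standard_normal((4, 8))
    print(ssm_scan(seq, a, b, c).shape)

    feat = rng.standard_normal((4, 32, 32))
    w = rng.standard_normal((4, 32, 17))
    print(frequency_attention(feat, w).shape)
```

The scan visits each of the L sequence positions once, whereas self-attention compares every position with every other one, which is where the quadratic cost mentioned in the abstract comes from. The gate in `frequency_attention` leaves a frequency bin nearly untouched when its logit is large and suppresses it when the logit is strongly negative, which is one simple way to "preserve key frequency information".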