Computer Engineering and Applications, 2025, Vol. 61, Issue (12): 258-266. DOI: 10.3778/j.issn.1002-8331.2403-0222

• Graphics and Image Processing •

从放缩到重构的Swin Transformer图像矩形化网络

杨利春,党建武,王梦思,张天胤,田彬   

  1. 兰州交通大学 光电技术与智能控制教育部重点实验室,兰州 730070
  2. 兰州交通大学 电子与信息工程学院,兰州 730070
  • 出版日期:2025-06-15 发布日期:2025-06-13

Swin Transformer Image Rectangling Network from Scaling to Reconstruction

YANG Lichun, DANG Jianwu, WANG Mengsi, ZHANG Tianyin, TIAN Bin   

  1. Key Laboratory of Optoelectronic Technology and Intelligent Control, Ministry of Education, Lanzhou Jiaotong University, Lanzhou 730070, China
  2. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
  • Published: 2025-06-15 Online: 2025-06-13

摘要: 图像矩形化旨在解决拼接后的图像所存在的边界不规则问题。现有的矩形化方法通过多步扭曲处理来实现图像的矩形化处理。但是这些方法仍然存在一定的内容失真及边界破损等问题。为了解决这些问题,提出了一种单步扭曲处理的图像矩形化解决方案(IRFormer)。具体来说,结合了尺度放缩策略,构建了一个基于Swin Transformer架构的低分辨率单步扭曲分支;结合轻量化策略,构建了一个高分辨率重建及边界修复的分支。通过广泛的实验,验证了IRFormer在多种场景中均具有良好的矩形化表现,具有较高的内容保真性和边界完整性。在定性和定量比较中,IRFormer均展现出了最先进的矩形化性能。

关键词: 图像矩形化, 单级网格预测, 尺度放缩, Swin Transformer, 超分辨率重建

Abstract: Image rectangling aims to resolve the irregular boundaries of stitched images. Existing rectangling methods rectangle an image through multi-step warping, but they still suffer from content distortion and broken boundaries. To address these problems, a single-step warping solution for image rectangling (IRFormer) is proposed. Specifically, a low-resolution single-step warping branch based on the Swin Transformer architecture is built using a scaling strategy, and a branch for high-resolution reconstruction and boundary restoration is built using a lightweight design strategy. Extensive experiments verify that IRFormer rectangles images well across a variety of scenes, with high content fidelity and boundary integrity. In both qualitative and quantitative comparisons, IRFormer demonstrates state-of-the-art rectangling performance.

Key words: image rectangling, single-stage mesh prediction, scaling, Swin Transformer, super-resolution reconstruction