Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (9): 255-262. DOI: 10.3778/j.issn.1002-8331.2312-0107

• Graphics and Image Processing •


VTON-FG: Virtual Try-on Network Guided by Image Edge Contour Features

TAN Taizhe, CHEN Hongcai, YANG Zhuo   

  1. School of Computer, Guangdong University of Technology, Guangzhou 510006, China
  2. Heyuan Bay Area Digital Economy Technology Development Co., Ltd., Heyuan, Guangdong 517400, China
  • Online: 2025-05-01    Published: 2025-04-30


Abstract: To address the blurry and incomplete arm and clothing-texture details commonly produced by current virtual try-on methods, a novel virtual try-on network named VTON-FG is proposed, built on the CP-VTON+ model. The network introduces a feature-guided module: in the try-on module, it extracts clothing edge-contour features and fuses features at different scales into the corresponding levels of the U-Net encoder, preventing the loss of arm and texture information and guiding the network to generate clearer arm images and texture details. Additionally, in the geometric matching module, an edge-contour-map loss is introduced to counter the texture distortion that clothing warping can cause, improving the realism of the warped clothing. A series of ablation experiments progressively verifies the key role each module plays in improving model performance. The experimental results show that these targeted improvements enhance the model's performance, confirming the effectiveness and importance of each module within the overall architecture. Relative to CP-VTON+, the method improves the structural similarity (SSIM), perceptual similarity (LPIPS), and Inception Score (IS) metrics by 2.8%, 19.4%, and 6.9%, respectively.

Key words: virtual try-on, controllable image generation, feature guidance, coordinate attention, edge detection
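The abstract names two mechanisms without giving their formulas: a multi-scale clothing edge-contour feature fed into the U-Net encoder levels, and an edge-contour-map loss in the geometric matching module. A minimal NumPy sketch of one plausible formulation, assuming Sobel-magnitude edge maps, 2×2 average-pooled pyramids, and an L1 loss — all function names and design choices here are illustrative assumptions, not taken from the paper:

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def sobel_edge_map(img):
    """Edge-contour map of a 2-D grayscale image via Sobel gradient magnitude."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    gx = np.empty((h, w))
    gy = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(patch * SOBEL_X)
            gy[i, j] = np.sum(patch * SOBEL_Y)
    return np.hypot(gx, gy)

def edge_feature_pyramid(edge_map, levels=3):
    """Multi-scale copies of the clothing edge map, one per U-Net encoder level,
    halved by 2x2 average pooling at each level (a stand-in for learned fusion)."""
    pyramid = [edge_map]
    for _ in range(levels - 1):
        x = pyramid[-1]
        h, w = x.shape
        pyramid.append(x[:h // 2 * 2, :w // 2 * 2]
                       .reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3)))
    return pyramid

def edge_contour_loss(warped_cloth, target_cloth):
    """L1 distance between the edge maps of warped and target clothing."""
    return float(np.mean(np.abs(sobel_edge_map(warped_cloth)
                                - sobel_edge_map(target_cloth))))
```

In the actual network, the pyramid levels would presumably be fused (e.g. concatenated channel-wise) with the encoder feature maps, and the contour loss added to the geometric-matching objective with some weight; both details are beyond what this abstract specifies.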