Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (23): 186-196. DOI: 10.3778/j.issn.1002-8331.2105-0278

• Pattern Recognition and Artificial Intelligence •

FP-VTON: Attention-Based Feature Preserving Virtual Try-on Network

TAN Zelin (谭泽霖), BAI Jing (白静), CHEN Ran (陈冉), ZHANG Shaomin (张少敏), QIN Feiwei (秦飞巍)

  1. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China
  2. The Key Laboratory of Images & Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan 750021, China
  3. Computer and Software School, Hangzhou Dianzi University, Hangzhou 310018, China
  • Online: 2022-12-01 Published: 2022-12-01

Abstract: With the rapid development of the Internet economy and artificial intelligence, more and more consumers buy clothes online, and virtual try-on can offer them a convenient, fast fitting service and a better online shopping experience. Virtual try-on methods based on 2D images avoid the expensive hardware and time costs of 3D virtual try-on, but they still adapt poorly to models with different body shapes and large pose changes, and they do not fully preserve the complex textures and local details of the target garment. To address these problems, this paper proposes FP-VTON, an attention-based feature-preserving virtual try-on network that generates try-on results with a two-stage network: clothing deformation followed by clothing fusion. Because conventional convolution struggles with the large deformations of non-rigid objects, a non-local feature attention mechanism that captures global features is introduced into both stages. Because the thin-plate spline (TPS) transformation can warp the garment severely, a clothing fidelity loss function is proposed that constrains the distances and slopes between points on the TPS grid. Quantitative comparisons and qualitative visual comparisons with state-of-the-art methods show that FP-VTON generates more realistic images under large pose deformations, complex garment textures, and unusual body shapes, and that it more effectively preserves the garment's complex texture details and the user's identity information.

Key words: deep learning, virtual try-on, non-rigid transformation, attention mechanism, thin-plate spline (TPS) transformation
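
The abstract describes the clothing fidelity loss only as a constraint on the distances and slopes between grid points and does not give its formula. The sketch below is therefore an assumption, not the paper's definition: it penalizes unequal spacing and unequal slopes between each control point and its two neighbours along every row and column of a warped grid. The function name `grid_fidelity_loss`, the helper `_ratio`, and the weights `w_dist` and `w_slope` are all hypothetical.

```python
import math


def _ratio(num, den, eps=1e-6):
    # Guarded division so near-degenerate segments do not blow up.
    if abs(den) < eps:
        den = eps if den >= 0 else -eps
    return num / den


def grid_fidelity_loss(grid, w_dist=1.0, w_slope=1.0):
    """Hypothetical regularity penalty on a warped control-point grid.

    grid: list of rows, each row a list of (x, y) control points.
    Returns the mean penalty over all (previous, current, next) triples
    taken along rows and along columns.
    """
    rows, cols = len(grid), len(grid[0])
    triples = []  # (prev, cur, next, is_horizontal)
    for r in range(rows):
        for c in range(1, cols - 1):
            triples.append((grid[r][c - 1], grid[r][c], grid[r][c + 1], True))
    for c in range(cols):
        for r in range(1, rows - 1):
            triples.append((grid[r - 1][c], grid[r][c], grid[r + 1][c], False))
    if not triples:
        return 0.0

    total = 0.0
    for p0, p, p1, horizontal in triples:
        # Distance term: neighbouring segments should have similar length.
        d0 = math.hypot(p[0] - p0[0], p[1] - p0[1])
        d1 = math.hypot(p1[0] - p[0], p1[1] - p[1])
        # Slope term: neighbouring segments should have similar direction.
        # Rows use dy/dx; columns use dx/dy so an upright grid gives 0.
        if horizontal:
            s0 = _ratio(p[1] - p0[1], p[0] - p0[0])
            s1 = _ratio(p1[1] - p[1], p1[0] - p[0])
        else:
            s0 = _ratio(p[0] - p0[0], p[1] - p0[1])
            s1 = _ratio(p1[0] - p[0], p1[1] - p[1])
        total += w_dist * abs(d0 - d1) + w_slope * abs(s0 - s1)
    return total / len(triples)


# A regular 5x5 grid, and a copy with one control point displaced.
regular = [[(float(c), float(r)) for c in range(5)] for r in range(5)]
bent = [list(row) for row in regular]
bent[2][2] = (2.4, 2.3)
```

Under this assumed form, an evenly spaced grid incurs no penalty, while a grid with a displaced control point incurs a positive one. In a trained network the same terms would be written with differentiable tensor operations; plain Python is used here only to keep the illustration self-contained.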