Computer Engineering and Applications, 2023, Vol. 59, Issue (4): 208-215. DOI: 10.3778/j.issn.1002-8331.2110-0223

• Graphics and Image Processing •


Controllable Face Image Synthesis Algorithm Based on Attribute Decomposition and Fusion

LIANG Hong, CHEN Qiushi, SHAO Mingwen   

  1. College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266000, China
  • Online: 2023-02-15; Published: 2023-02-15



Abstract: In real-world settings, face images are often difficult to collect directly due to privacy and security constraints, so image generation methods are a natural alternative. When a generative adversarial network is used for image generation, the results are prone to low resolution, blurred edges, and loss of identity features. To address these problems, this paper proposes a new face feature generation model: key attribute information is embedded into the latent space as an independent code and then fused and interpolated with global features to achieve controllable generation of key facial features; an improved attention module is introduced to model the correlation between local and global features during generation; and a color difference loss and a face component loss are jointly added to the overall loss function to constrain pixel color and facial texture features. The algorithm generates natural, realistic appearance features in local facial regions, retains the original identity information, and produces smooth facial contours. Experiments on a preprocessed CelebA dataset show that the algorithm yields a clear improvement in subjective visual quality and consistently outperforms existing methods in PSNR and SSIM.
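The two core ideas in the abstract — interpolating an independent attribute code with the global latent code for controllable generation, and combining the extra losses into one joint objective — can be illustrated with a minimal sketch. All function names, the linear form of the fusion, and the loss weights here are assumptions for illustration only; they are not taken from the paper.

```python
def fuse_attribute(z_global, z_attr, alpha):
    """Interpolate an attribute code into the global latent code.

    alpha = 0 keeps the original face; alpha = 1 fully applies the target
    attribute. A simple linear blend is assumed here as a stand-in for the
    paper's fusion-and-interpolation step.
    """
    return [(1.0 - alpha) * g + alpha * a for g, a in zip(z_global, z_attr)]


def total_loss(l_adv, l_color, l_component, w_color=1.0, w_component=1.0):
    """Weighted sum of the adversarial, color difference, and face component
    losses, reflecting the joint objective described in the abstract.
    The weights are hypothetical hyperparameters."""
    return l_adv + w_color * l_color + w_component * l_component


# Controllable generation: sweep alpha to vary the attribute strength.
z_g = [0.0, 0.0, 0.0, 0.0]   # toy global latent code
z_a = [1.0, 1.0, 1.0, 1.0]   # toy attribute code
half = fuse_attribute(z_g, z_a, 0.5)  # midpoint between the two codes
```

Sweeping `alpha` from 0 to 1 would trace a path in latent space from the original face toward the target attribute, which is how interpolation-based methods typically expose a continuous control knob.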

Key words: generative adversarial networks, face feature generation, attention mechanism