[1] 王宇昊, 何彧, 王铸. 基于深度学习的文本到图像生成方法综述[J]. 计算机工程与应用, 2022, 58(10): 50-67.
WANG Y H, HE Y, WANG Z. Overview of text-to-image generation methods based on deep learning[J]. Computer Engineering and Applications, 2022, 58(10): 50-67.
[2] LU Y, SHU Z, LI Y, et al. Content-aware GAN compression[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 12156-12166.
[3] KUMAR P M R, JAYAGOPAL P. Generative adversarial networks: a survey on applications and challenges[J]. International Journal of Multimedia Information Retrieval, 2021, 10(1): 1-24.
[4] GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems, 2014: 2672-2680.
[5] 吴福祥, 程俊. 基于自编码器生成对抗网络的可配置文本图像编辑[J]. 软件学报, 2022, 33(9): 3139-3151.
WU F X, CHENG J. Configurable text-based image editing by autoencoder-based generative adversarial networks[J]. Journal of Software, 2022, 33(9): 3139-3151.
[6] RUAN S, ZHANG Y, ZHANG K, et al. DAE-GAN: dynamic aspect-aware GAN for text-to-image synthesis[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 13960-13969.
[7] 黄晓琪, 王莉, 李钢. 融合胶囊网络的文本-图像生成对抗模型[J]. 计算机工程与应用, 2021, 57(14): 176-180.
HUANG X Q, WANG L, LI G. Text-image generative adversarial model fusing capsule networks[J]. Computer Engineering and Applications, 2021, 57(14): 176-180.
[8] 陈积泽, 姜晓燕, 高永彬. 基于门机制注意力模型的文本生成图像方法[J]. 计算机工程与应用, 2023, 59(12): 208-216.
CHEN J Z, JIANG X Y, GAO Y B. Text-to-image generation method based on attention model with gate mechanism[J]. Computer Engineering and Applications, 2023, 59(12): 208-216.
[9] DASH A, GAMBOA J C B, AHMED S, et al. TAC-GAN: text conditioned auxiliary classifier generative adversarial network[J]. arXiv:1703.06412, 2017.
[10] XIA W, YANG Y, XUE J H, et al. TediGAN: text-guided diverse face image generation and manipulation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 2256-2265.
[11] 鞠思博, 徐晶, 李岩芳. 基于自注意力机制的文本生成单目标图像方法[J]. 计算机工程与应用, 2022, 58(3): 249-258.
JU S B, XU J, LI Y F. Text-to-single-object image generation method based on self-attention mechanism[J]. Computer Engineering and Applications, 2022, 58(3): 249-258.
[12] YE H, YANG X, TAKAC M, et al. Improving text-to-image synthesis using contrastive learning[J]. arXiv:2107.02423, 2021.
[13] 徐泽, 帅仁俊, 刘开凯, 等. 基于特征融合的文本到图像的生成[J]. 计算机科学, 2021, 48(6): 125-130.
XU Z, SHUAI R J, LIU K K, et al. Generation of realistic images from text based on feature fusion[J]. Computer Science, 2021, 48(6): 125-130.
[14] LI B, QI X, LUKASIEWICZ T, et al. ManiGAN: text-guided image manipulation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 7880-7889.
[15] ZHANG H, XU T, LI H, et al. StackGAN: text to photo-realistic image synthesis with stacked generative adversarial networks[C]//Proceedings of the IEEE International Conference on Computer Vision, 2017: 5907-5915.
[16] ZHANG H, XU T, LI H, et al. StackGAN++: realistic image synthesis with stacked generative adversarial networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 41(8): 1947-1962.
[17] XU T, ZHANG P, HUANG Q, et al. AttnGAN: fine-grained text to image generation with attentional generative adversarial networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 1316-1324.
[18] ZHU M, PAN P, CHEN W, et al. DM-GAN: dynamic memory generative adversarial networks for text-to-image synthesis[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 5802-5810.
[19] LI B, QI X, LUKASIEWICZ T, et al. Controllable text-to-image generation[J]. arXiv:1909.07083, 2019.
[20] REED S, AKATA Z, YAN X, et al. Generative adversarial text to image synthesis[C]//Proceedings of the International Conference on Machine Learning, 2016: 1060-1069.
[21] 谈馨悦, 何小海, 王正勇, 等. 基于Transformer交叉注意力的文本生成图像技术[J]. 计算机科学, 2022, 49(2): 107-115.
TAN X Y, HE X H, WANG Z Y, et al. Text-to-image generation technology based on Transformer cross attention[J]. Computer Science, 2022, 49(2): 107-115.
[22] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017: 6000-6010.
[23] RADFORD A, KIM J W, HALLACY C, et al. Learning transferable visual models from natural language supervision[C]//Proceedings of the International Conference on Machine Learning, 2021: 8748-8763.
[24] SCHUSTER M, PALIWAL K K. Bidirectional recurrent neural networks[J]. IEEE Transactions on Signal Processing, 1997, 45(11): 2673-2681.
[25] ZHONG Z, ZHENG L, KANG G, et al. Random erasing data augmentation[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2020: 13001-13008.
[26] WAH C, BRANSON S, WELINDER P, et al. The Caltech-UCSD Birds-200-2011 dataset[R]. California: California Institute of Technology, 2011.
[27] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]//Proceedings of the 13th European Conference on Computer Vision, 2014: 740-755.
[28] HEUSEL M, RAMSAUER H, UNTERTHINER T, et al. GANs trained by a two time-scale update rule converge to a local Nash equilibrium[J]. arXiv:1706.08500, 2017.
[29] SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the inception architecture for computer vision[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 2818-2826.
[30] SALIMANS T, GOODFELLOW I, ZAREMBA W, et al. Improved techniques for training GANs[C]//Proceedings of the 30th International Conference on Neural Information Processing Systems, 2016: 2234-2242.
[31] VAN ERVEN T, HARREMOS P. Rényi divergence and Kullback-Leibler divergence[J]. IEEE Transactions on Information Theory, 2014, 60(7): 3797-3820.
[32] KINGMA D P, BA J. Adam: a method for stochastic optimization[J]. arXiv:1412.6980, 2014.