Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (19): 21-39. DOI: 10.3778/j.issn.1002-8331.2211-0392
• Research Hotspots and Reviews •
LAI Li’na, MI Yu, ZHOU Longlong, RAO Jiyong, XU Tianyang, SONG Xiaoning
Online: 2023-10-01
Published: 2023-10-01
赖丽娜,米瑜,周龙龙,饶季勇,徐天阳,宋晓宁
LAI Li’na, MI Yu, ZHOU Longlong, RAO Jiyong, XU Tianyang, SONG Xiaoning. Survey About Generative Adversarial Network and Text-to-Image Synthesis[J]. Computer Engineering and Applications, 2023, 59(19): 21-39.
赖丽娜, 米瑜, 周龙龙, 饶季勇, 徐天阳, 宋晓宁. 生成对抗网络与文本图像生成方法综述[J]. 计算机工程与应用, 2023, 59(19): 21-39.
URL: http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.2211-0392