[1] LIU Y C, SHU Z X, LI Y J, et al. Content-aware GAN compression[C]//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 12151-12161.
[2] YANG Z, HU Z T, SALAKHUTDINOV R, et al. Improved variational autoencoders for text modeling using dilated convolutions[C]//Proceedings of the 34th International Conference on Machine Learning, 2017: 3881-3890.
[3] BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate[J]. arXiv:1409.0473, 2014.
[4] GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial networks[J]. Communications of the ACM, 2020, 63(11): 139-144.
[5] REED S, AKATA Z, YAN X, et al. Generative adversarial text to image synthesis[C]//Proceedings of the 33rd International Conference on Machine Learning, 2016: 1060-1069.
[6] ZHANG H, XU T, LI H S, et al. StackGAN: text to photo-realistic image synthesis with stacked generative adversarial networks[C]//Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 5908-5916.
[7] ZHANG H, XU T, LI H S, et al. StackGAN++: realistic image synthesis with stacked generative adversarial networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(8): 1947-1962.
[8] XU T, ZHANG P C, HUANG Q Y, et al. AttnGAN: fine-grained text to image generation with attentional generative adversarial networks[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 1316-1324.
[9] REED S, AKATA Z, MOHAN S, et al. Learning what and where to draw[C]//Proceedings of the 30th International Conference on Neural Information Processing Systems, 2016: 217-225.
[10] TAN H C, LIU X P, LI X, et al. Semantics-enhanced adversarial nets for text-to-image synthesis[C]//Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 10500-10509.
[11] LI B, QI X, LUKASIEWICZ T, et al. Controllable text-to-image generation[C]//Proceedings of the 33rd International Conference on Neural Information Processing Systems, 2019: 2065-2075.
[12] ZHU M F, PAN P B, CHEN W, et al. DM-GAN: dynamic memory generative adversarial networks for text-to-image synthesis[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 5795-5803.
[13] QIAO T T, ZHANG J, XU D Q, et al. MirrorGAN: learning text-to-image generation by redescription[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 1505-1514.
[14] CHENG J, WU F X, TIAN Y L, et al. RiFeGAN: rich feature generation for text-to-image synthesis from prior knowledge[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 10908-10917.
[15] YANG Y H, WANG L, XIE D, et al. Multi-sentence auxiliary adversarial networks for fine-grained text-to-image synthesis[J]. IEEE Transactions on Image Processing, 2021, 30: 2798-2809.
[16] LIAO W T, HU K, YANG M Y, et al. Text to image generation with semantic-spatial aware GAN[C]//Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 18166-18175.
[17] MO J W, XU K L, LIN L P, et al. Text-to-image generation combined with mutual information maximization[J]. Journal of Xidian University, 2019, 46(5): 180-188.
[18] SUN Y, LI L Y, YE Z H, et al. Text-to-image synthesis method based on multi-level structure generative adversarial networks[J]. Journal of Computer Applications, 2019, 39(11): 3204-3209.
[19] YIN G J, LIU B, SHENG L, et al. Semantics disentangling for text-to-image generation[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 2322-2331.
[20] LI J F, WEN Y, HE L H. SCConv: spatial and channel reconstruction convolution for feature redundancy[C]//Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 6153-6162.
[21] WU Y, HE K. Group normalization[C]//Proceedings of the European Conference on Computer Vision, 2018: 3-19.
[22] SCHUSTER M, PALIWAL K K. Bidirectional recurrent neural networks[J]. IEEE Transactions on Signal Processing, 1997, 45(11): 2673-2681.
[23] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90.
[24] SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the inception architecture for computer vision[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 2818-2826.
[25] SANDLER M, HOWARD A, ZHU M L, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4510-4520.
[26] BALLES L, HENNIG P. Dissecting Adam: the sign, magnitude and variance of stochastic gradients[C]//Proceedings of the International Conference on Machine Learning, 2018: 404-413.
[27] WAH C, BRANSON S, WELINDER P, et al. The Caltech-UCSD Birds-200-2011 dataset[R]. California Institute of Technology, 2011.
[28] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]//Proceedings of the European Conference on Computer Vision. Cham: Springer, 2014: 740-755.
[29] SALIMANS T, GOODFELLOW I, ZAREMBA W, et al. Improved techniques for training GANs[C]//Proceedings of the 30th International Conference on Neural Information Processing Systems, 2016: 2234-2242.
[30] HEUSEL M, RAMSAUER H, UNTERTHINER T, et al. GANs trained by a two time-scale update rule converge to a local Nash equilibrium[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017: 6629-6640.
[31] VAN ERVEN T, HARREMOS P. Rényi divergence and Kullback-Leibler divergence[J]. IEEE Transactions on Information Theory, 2014, 60(7): 3797-3820.
[32] TAO M, TANG H, WU F, et al. DF-GAN: a simple and effective baseline for text-to-image synthesis[C]//Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 16494-16504.
[33] PENG D L, YANG W C, LIU C, et al. SAM-GAN: self-attention supporting multi-stage generative adversarial networks for text-to-image synthesis[J]. Neural Networks, 2021, 138: 57-67.
[34] ZHANG Z X, SCHOMAKER L. DiverGAN: an efficient and effective single-stage framework for diverse text-to-image generation[J]. Neurocomputing, 2022, 473: 182-198.
[35] TAN H C, LIU X P, YIN B C, et al. Cross-modal semantic matching generative adversarial networks for text-to-image synthesis[J]. IEEE Transactions on Multimedia, 2021, 24: 832-845.
[36] LAZCANO D, FRANCO N F, CREIXELL W. HGAN: hyperbolic generative adversarial network[J]. IEEE Access, 2021, 9: 96309-96320.
[37] QU E, ZOU D. Autoencoding hyperbolic representation for adversarial generation[J]. arXiv:2201.12825, 2022.
[38] KANG M, ZHU J Y, ZHANG R, et al. Scaling up GANs for text-to-image synthesis[C]//Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 10124-10134.