Generative Adversarial Networks in Our Journal

    Survey of Generative Adversarial Networks
    SUN Shukui, FAN Jing, QU Jinshuai, LU Peidong
    Computer Engineering and Applications    2022, 58 (18): 90-103.   DOI: 10.3778/j.issn.1002-8331.2205-0097
    With its strong adversarial learning ability, the generative adversarial network (GAN) has attracted growing interest from researchers in many fields. This paper reviews the development background, framework, and objective function of GAN, analyzes the causes of mode collapse and vanishing gradients during training, and introduces in detail the GAN-derived models obtained by changing the architecture or modifying the objective function. It then summarizes the metrics used to evaluate the quality and diversity of generated images and surveys the wide application of GAN in different fields. Finally, it puts forward prospects for future research directions in this area.
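    The GAN objective function the survey reviews can be illustrated with a small sketch (our own illustrative code, not from the surveyed paper; the function name and inputs are hypothetical). Given discriminator probabilities on real and generated samples, it evaluates the original minimax value V(D, G) = E[log D(x)] + E[log(1 − D(G(z)))]:

```python
import math

def gan_value(d_real, d_fake):
    """Original GAN minimax objective, estimated from batches of
    discriminator outputs: d_real on real samples, d_fake on generated ones."""
    eps = 1e-12  # guard against log(0)
    real_term = sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term
```

At the theoretical equilibrium the discriminator outputs 0.5 everywhere, giving the value 2·log(0.5) ≈ −1.386; a confident discriminator drives the value higher, which is what the generator trains against.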
    Image Animation Stylization Based on Generative Adversarial Network
    WANG Yifan, ZHAO Leyi, LI Yi
    Computer Engineering and Applications    2022, 58 (18): 104-110.   DOI: 10.3778/j.issn.1002-8331.2109-0484
    Current cartoon-style image generation methods still have limitations, such as unrealistic color and inadequate processing of local detail. Converting an input image into an animation style quickly requires a deep learning approach. Based on the idea of the generative adversarial network, this paper proposes an animation-stylization-encoding GAN that transforms the style of an input image into that of Hayao Miyazaki's animated films. The network structure is substantially optimized by adding an adaptive instance normalization (AdaIN) module and a multi-layer perceptron (MLP) module, which also improves the experimental results. For the loss function, learned perceptual image patch similarity (LPIPS) is introduced as the content loss, and the binary cross-entropy loss (BCELoss) serves as the adversarial loss. Experimental results show that the network performs well on animation stylization, achieving an FID score of 72, and can be flexibly applied to animating various types of pictures.
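    The AdaIN module mentioned in the abstract re-styles content feature maps by matching their per-channel statistics to those of a style image. A minimal NumPy sketch, assuming feature maps shaped (channels, height, width) (illustrative only, not the paper's implementation):

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive instance normalization: normalize each content channel to
    zero mean / unit std, then rescale with the style channel's statistics."""
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True)
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True)
    return s_std * (content - c_mean) / (c_std + eps) + s_mean
```

After the transform, each channel of the output carries the style's mean and standard deviation while keeping the content's spatial structure, which is why AdaIN is a common style-injection mechanism.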
    Generation and Classification of Skin Cancer Images Based on Self-Attention StyleGAN
    ZHAO Chen, SHUAI Renjun, MA Li, LIU Wenjia, WU Menglin
    Computer Engineering and Applications    2022, 58 (18): 111-121.   DOI: 10.3778/j.issn.1002-8331.2102-0092
    Skin cancer classification tasks, represented by melanoma, suffer from an imbalance in the number and weight of sample classes in the dataset, and the poor quality of skin cancer samples generated by existing generative adversarial networks makes them difficult to distinguish in clinical diagnosis. To address this, a skin cancer image generation and classification framework is proposed that combines a self-attention style-based generative adversarial network (Self-Attention-StyleGAN) with SE-ResNeXt-50. The framework introduces a self-attention mechanism into StyleGAN, redesigns the generator's style control and noise input structure, and reconstructs the discriminator to guide the generator, so that high-quality skin lesion images can be synthesized effectively. Skin cancer images are then classified with SE-ResNeXt-50, which better extracts information from feature maps at different levels and thereby improves the balanced multi-class accuracy (BMA). Experimental results show that the samples generated by this model on the ISIC2019 skin cancer dataset are of high quality and that the classification BMA reaches 94.71%. The method improves the accuracy of skin lesion image classification, helping dermatologists judge and diagnose different types of skin lesions, including lesions at different stages and those that are difficult to distinguish.
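    The self-attention mechanism added to StyleGAN lets every spatial position attend to every other, rather than only a local neighborhood. A scaled dot-product sketch over flattened spatial features (our own illustrative code with hypothetical names, not the paper's layer):

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention.
    x: (N, C) flattened spatial features; w_q, w_k, w_v: (C, d) projections."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over positions
    return weights @ v  # (N, d): each position aggregates all others
```

Because the attention weights span the full feature map, the generator can coordinate globally consistent structure in the synthesized lesion images, which is the usual motivation for adding self-attention to a GAN.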
    Generating Face from Voice: Method of Voice-Driven Static and Dynamic Face Generation
    ZHAO Lulu, CHEN Yanxiang, ZHAO Pengcheng, ZHU Yupeng, SHENG Zhentao
    Computer Engineering and Applications    2022, 58 (18): 122-129.   DOI: 10.3778/j.issn.1002-8331.2101-0318
    Voice-driven face generation aims to explore the static and dynamic correlations between voice fragments and faces, so that a corresponding face image can be generated from a given voice fragment. However, most existing research considers only one of these correlations. In addition, static face generation methods rely strictly on time-aligned audio-visual data, which limits the use of such static models to a certain extent. Therefore, a voice-driven static and dynamic face generation model (SDVF-GAN) based on conditional generative adversarial networks is proposed. SDVF-GAN builds a voice encoder network that obtains more accurate auditory features through a self-attention mechanism; both the static and the dynamic generation networks take these auditory features as input. The static generation network uses an image discriminator with a projection layer, ensuring that it can synthesize high-quality static face images with consistent attributes (age, gender). The dynamic generation network uses the image discriminator together with an attention-based lip discriminator to generate a sequence of dynamic face images with lip synchronization. In the experiments, the authors construct an attribute-aligned Voice-Face dataset to optimize the static model and use the existing LRW dataset to train the dynamic model. The results demonstrate that the model comprehensively learns the attribute correspondence and lip-synchronization relationship between voice and face, and can generate face images of higher quality with stronger correlation and synchronization.
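    The projection-layer discriminator mentioned above conditions on attributes by adding an inner product between image features and a class embedding to the unconditional logit. A minimal sketch of this head (illustrative names and shapes; not the paper's exact model):

```python
import numpy as np

def projection_logit(phi_x, class_embed, psi_w):
    """Projection-style conditional discriminator head.
    phi_x: (d,) image features; psi_w: (d,) unconditional weights;
    class_embed: (d,) embedding of the conditioning class (e.g. age/gender)."""
    unconditional = phi_x @ psi_w          # real/fake score regardless of class
    conditional = phi_x @ class_embed      # agreement between image and class
    return unconditional + conditional
```

The conditional term rewards images whose features align with the embedding of the intended attribute, which is how the discriminator enforces attribute consistency.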
    Emotional Dialogue Response Generation Based on Generative Adversarial Network
    LI Kaiwei, MA Li
    Computer Engineering and Applications    2022, 58 (18): 130-136.   DOI: 10.3778/j.issn.1002-8331.2101-0329
    In recent years, with the development of neural network and natural language processing technology, research on dialogue generation based on deep neural networks has made breakthrough progress, and human-machine dialogue systems are widely used in daily life, for example as e-commerce customer service and voice assistants. However, existing models tend to produce generic answers and generally lack emotional factors. To solve this problem, this paper proposes an emotional dialogue generation model based on a generative adversarial network, EC-GAN (emotional conversation generative adversarial network), which generates more meaningful and customizable emotional responses by combining multi-index rewards and emotional editing constraints. The generator uses a Seq2Seq model to produce responses and accepts rewards from the discriminator, which guide the generated sentences toward greater diversity and emotional richness. The discriminator has a dual structure: a content discriminator determines whether a reply is a generic response, and an emotion discriminator checks the consistency between the emotion of the generated sentence and the specified emotion category; the discrimination results are fed back to the generator to guide response generation. The model also attends to the emotional change between input and response, verifying the directionality of interactive emotional resonance. Experiments on NLPCC 2017 Shared Task 4 (emotional conversation generation) show that the model not only improves the fluency and diversity of responses but also significantly increases their emotional richness.
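    Because text generation is discrete, discriminator rewards typically reach the Seq2Seq generator through a policy-gradient (REINFORCE-style) loss rather than backpropagation through samples. A sketch of that reward-weighting step, assuming per-token log-probabilities and discriminator rewards (illustrative code, not the paper's training loop):

```python
def policy_gradient_loss(log_probs, rewards, baseline=0.0):
    """REINFORCE-style generator loss: each token's log-probability is
    weighted by its discriminator reward minus a baseline, so tokens that
    earned above-baseline rewards are reinforced."""
    n = len(log_probs)
    return -sum(lp * (r - b) for lp, r, b in
                zip(log_probs, rewards, [baseline] * n)) / n
```

When rewards match the baseline the gradient signal vanishes, which is why a good baseline (e.g. an average reward) reduces variance without biasing the update.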