Design of DAPGD of Adversarial Attack Algorithm Against Deepfake

doi:10.3778/j.issn.1002-8331.2110-0026

Abstract

This paper proposes dynamic APGD (DAPGD), an improved adversarial example generation algorithm that protects images from tampering by deepfake models. Adversarial examples generated by DAPGD are used in place of the original images and significantly distort the output of deepfake models, so that forged images cannot be generated effectively. DAPGD adopts an adaptively decaying learning rate, which accelerates convergence and improves the quality of the adversarial examples at convergence. Because APGD tends to miss the best moment to decay the learning rate, DAPGD dynamically sets the checkpoints at which the learning rate is decayed, so that the decay takes fuller effect. Finally, because the random parameters used in deepfake models make the loss function unstable, DAPGD removes the local early stopping mechanism of APGD, which improves both the effectiveness and the speed of the algorithm. DAPGD attack experiments are conducted on three mainstream deepfake models and compared with the original algorithm and other algorithms. The results show that adversarial examples generated by DAPGD perform better on both metrics, output distortion size and attack success rate, and interfere more effectively with image forgery by deepfake models.

Keywords: deepfake, adversarial example, learning rate decay, dynamic checkpoint, early stopping
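
The step-size decay idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' DAPGD: the abstract does not specify the dynamic checkpoint rule, so the sketch uses a fixed, hypothetical checkpoint interval (`check_every`) and a simplified decay condition (`rho`). The one-layer `model`, the weight matrix `w`, and all parameter values are assumptions made for illustration only. In line with the abstract, the loop has no early stopping.

```python
import math
import random


def model(x, w):
    """Toy stand-in for a deepfake generator: a single tanh layer."""
    return [math.tanh(sum(wij * xj for wij, xj in zip(row, x))) for row in w]


def distortion(x_adv, x, w):
    """Mean squared difference between the outputs for x_adv and for x."""
    ya, y = model(x_adv, w), model(x, w)
    return sum((a - b) ** 2 for a, b in zip(ya, y)) / len(y)


def grad_distortion(x_adv, x, w):
    """Analytic gradient of distortion() with respect to x_adv."""
    ya, y = model(x_adv, w), model(x, w)
    n = len(y)
    # Chain rule through tanh: d/dz of (tanh(z) - y)^2 / n
    dz = [2.0 * (a - b) * (1.0 - a * a) / n for a, b in zip(ya, y)]
    return [sum(dz[i] * w[i][j] for i in range(n)) for j in range(len(x))]


def sign(v):
    return (v > 0) - (v < 0)


def pgd_with_decay(x, w, eps=0.05, steps=60, check_every=10, rho=0.75, seed=0):
    """Maximize output distortion inside an L-infinity ball of radius eps."""
    rng = random.Random(seed)
    # Random start: the gradient is exactly zero at x_adv == x.
    x_adv = [xi + rng.uniform(-eps, eps) for xi in x]
    cur_loss = distortion(x_adv, x, w)
    best, best_loss = list(x_adv), cur_loss
    step = 2.0 * eps  # assumed initial step size
    improved, best_at_check = 0, best_loss
    for k in range(1, steps + 1):
        g = grad_distortion(x_adv, x, w)
        cand = [xa + step * sign(gi) for xa, gi in zip(x_adv, g)]
        # Project back onto the eps-ball around the original input.
        cand = [min(max(c, xi - eps), xi + eps) for c, xi in zip(cand, x)]
        loss = distortion(cand, x, w)
        improved += loss > cur_loss
        x_adv, cur_loss = cand, loss
        if loss > best_loss:
            best, best_loss = list(cand), loss
        if k % check_every == 0:
            # Simplified rule (assumption): halve the step and restart from
            # the best point when too few steps improved the loss, or when
            # the best loss did not increase since the last checkpoint.
            if improved < rho * check_every or best_loss <= best_at_check:
                step /= 2.0
                x_adv, cur_loss = list(best), best_loss
            improved, best_at_check = 0, best_loss
    return best, best_loss
```

Calling `pgd_with_decay(x, w)` with a small input vector and weight matrix returns the best perturbed input found and its distortion. A real attack would replace the toy layer with the deepfake model and obtain gradients by automatic differentiation.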

QIU Haoxuan, DU Yanhui, LU Tianliang. Design of DAPGD of Adversarial Attack Algorithm Against Deepfake[J]. Computer Engineering and Applications, 2022, 58(24): 97-106.

裘昊轩, 杜彦辉, 芦天亮. 针对深度伪造的对抗攻击算法动态APGD设计[J]. 计算机工程与应用, 2022, 58(24): 97-106.
