Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (23): 61-70. DOI: 10.3778/j.issn.1002-8331.2108-0147

• Hot Topics and Reviews •

Survey of Adversarial Example Techniques Based on Deep Neural Networks

BAI Zhixu, WANG Hengjun, GUO Kexiang   

  1. Strategic Support Force Information Engineering University, Zhengzhou 450001, China
  • Online: 2021-12-01    Published: 2021-12-02

Abstract:

Deep learning has shown remarkable capability on some extremely difficult tasks, yet deep neural networks can hardly avoid misclassifying inputs to which perturbations have been deliberately added (so-called "adversarial examples"). Adversarial examples have become a research hotspot in the field of deep learning security, and studying their causes and working mechanisms helps optimize models for security and robustness. Building on the principles of adversarial examples, this paper classifies and summarizes the classical adversarial attack methods: according to the attack principle, they are divided into two major categories, white-box attacks and black-box attacks, and further into subcategories such as non-targeted attacks, targeted attacks, full-pixel perturbation attacks, and partial-pixel perturbation attacks. Several typical attack methods are reproduced on the ImageNet dataset; the experimental results are used to compare the strengths and weaknesses of these generation methods and to analyze the outstanding problems in adversarial example generation. Finally, the application and future development of adversarial examples are discussed.

Key words: deep neural network, adversarial example, white-box attack, black-box attack, robustness
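Illustrative example: among the classical white-box methods this survey covers, the fast gradient sign method (FGSM) is the simplest non-targeted, full-pixel perturbation attack. The following is a minimal PyTorch sketch, not the paper's own implementation; the names model, x, y, and eps are assumptions standing in for a pretrained classifier, an input image batch in [0, 1], its true labels, and the perturbation budget.

    import torch
    import torch.nn.functional as F

    def fgsm_attack(model, x, y, eps=8 / 255):
        # One-step FGSM (white-box, non-targeted, full-pixel): perturb every
        # pixel by eps in the direction that increases the classification loss.
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)    # loss w.r.t. the true labels
        loss.backward()                        # white-box: gradients are available
        x_adv = x + eps * x.grad.sign()        # step to increase the loss
        # A targeted variant would instead step so as to decrease the loss
        # computed against a chosen target label: x - eps * x.grad.sign().
        return x_adv.clamp(0.0, 1.0).detach()  # keep a valid image in [0, 1]

For instance, x_adv = fgsm_attack(model.eval(), images, labels) yields perturbed copies of images; the iterative, targeted, and partial-pixel attacks classified above refine this basic step with more iterations, different loss targets, or sparse pixel masks.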