Optimized Gradient Boosting Black-Box Adversarial Attack Algorithm

doi:10.3778/j.issn.1002-8331.2205-0051

Abstract

Abstract: Adversarial examples can cause deep neural networks to output wrong results with high confidence. Adversarial attacks are divided into white-box attacks and black-box attacks. White-box attacks already achieve high success rates, whereas existing black-box attacks still have low success rates because the target model and its parameters are unknown. To further improve the success rate of black-box attacks, this paper proposes an optimized gradient boosting black-box adversarial attack algorithm. First, images from other categories are mixed into the input image to obtain a mixed gradient that carries information from other categories. Second, the gradient variance from the previous iteration is used to adjust the gradient of the current image sample, yielding the optimized gradient. Third, the optimized gradient is combined with the Adam optimization algorithm for iterative optimization to generate highly transferable adversarial examples. Experiments on the ImageNet dataset show that the proposed algorithm effectively improves the black-box attack performance of adversarial examples. The average attack success rates for single-model and ensemble-model attacks are 71.7% and 88.3%, respectively, and the average rate reaches 96.8% after three transformation-based adversarial attack algorithms are integrated. In addition, the average success rate against five existing adversarial defense models is 92.7%, which is better than current input-transformation-based and gradient-based attack methods.

Key words: adversarial examples, deep neural network, black-box attack, optimized gradient, transferability
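
The three steps named in the abstract (image mixing, variance-based gradient adjustment, Adam iteration) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the mixing rule `x + eta * other`, the neighbourhood-sampling estimate of the gradient variance, the Adam step followed by projection onto an L-infinity ball, the toy linear softmax model, and every name and hyperparameter below are assumptions chosen only for illustration.

```python
import math
import random


def input_grad(W, x, y):
    """Gradient of softmax cross-entropy w.r.t. x for a toy model with logits = W x."""
    logits = [sum(w * v for w, v in zip(row, x)) for row in W]
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    err = [e / total for e in exps]  # softmax probabilities
    err[y] -= 1.0                    # subtract the one-hot label
    return [sum(W[k][i] * err[k] for k in range(len(W))) for i in range(len(x))]


def mixed_grad(W, x, y, others, eta):
    """Average gradient over copies of x blended with images from other classes."""
    total = [0.0] * len(x)
    for other in others:
        blended = [xi + eta * oi for xi, oi in zip(x, other)]  # assumed mixing rule
        total = [t + g for t, g in zip(total, input_grad(W, blended, y))]
    return [t / len(others) for t in total]


def attack(W, x, y, others, eps=0.1, steps=10, eta=0.2, beta=1.5,
           n_neigh=4, b1=0.9, b2=0.999, delta=1e-8, seed=0):
    """Hypothetical attack loop: mixed gradient + previous-step variance + Adam."""
    rng = random.Random(seed)
    alpha = eps / steps        # step size (assumed)
    dim = len(x)
    adv = list(x)
    var = [0.0] * dim          # gradient variance from the previous iteration
    m = [0.0] * dim            # Adam first moment
    s = [0.0] * dim            # Adam second moment
    for t in range(1, steps + 1):
        g = mixed_grad(W, adv, y, others, eta)
        # "Optimized gradient": current gradient adjusted by last iteration's variance.
        tuned = [gi + vi for gi, vi in zip(g, var)]
        # Estimate the variance for the next iteration from random neighbours.
        acc = [0.0] * dim
        for _ in range(n_neigh):
            near = [a + rng.uniform(-beta * eps, beta * eps) for a in adv]
            acc = [c + gn for c, gn in zip(acc, mixed_grad(W, near, y, others, eta))]
        var = [c / n_neigh - gi for c, gi in zip(acc, g)]
        # Adam moments with bias correction, then an ascent step on the loss.
        m = [b1 * mi + (1 - b1) * gi for mi, gi in zip(m, tuned)]
        s = [b2 * si + (1 - b2) * gi * gi for si, gi in zip(s, tuned)]
        m_hat = [mi / (1 - b1 ** t) for mi in m]
        s_hat = [si / (1 - b2 ** t) for si in s]
        adv = [a + alpha * mh / (math.sqrt(sh) + delta)
               for a, mh, sh in zip(adv, m_hat, s_hat)]
        # Keep the perturbation within eps of x and pixels within [0, 1].
        adv = [min(max(a, xi - eps, 0.0), xi + eps, 1.0) for a, xi in zip(adv, x)]
    return adv


# Toy usage: 3 classes, 8 "pixels", two images standing in for other categories.
rng = random.Random(1)
W = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(3)]
x = [rng.uniform(0.2, 0.8) for _ in range(8)]
others = [[rng.uniform(0, 1) for _ in range(8)] for _ in range(2)]
x_adv = attack(W, x, y=0, others=others)
```

In a black-box transfer setting, `W` would stand in for a substitute (white-box) model, and `x_adv` would then be evaluated on a different, unseen model. The sketch only guarantees that the perturbation stays within the `eps` budget; it says nothing about the success rates reported above.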


LIU Mengting, LING Jie. Optimized Gradient Boosting Black-Box Adversarial Attack Algorithm[J]. Computer Engineering and Applications, 2023, 59(18): 260-267.

刘梦庭, 凌捷. 优化梯度增强黑盒对抗攻击算法[J]. 计算机工程与应用, 2023, 59(18): 260-267.
