Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (21): 203-213. DOI: 10.3778/j.issn.1002-8331.2407-0405

• Pattern Recognition and Artificial Intelligence •

Robust Random Forests Based on Hierarchical and Random Strategy

LIU Zhaoyi, WANG Shitong

  1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Online: 2025-11-01  Published: 2025-10-31


Abstract: In recent years, the robustness of interpretable tree models against adversarial attacks has garnered significant attention. Despite notable progress, several challenges remain. Robust training algorithms for tree models, while effective in enhancing robustness, also increase training cost and reduce natural accuracy, a limitation they share with adversarial training. To address this challenge, this paper conducts experiments and analyses on random forest models and proposes a robust random forest algorithm based on a hierarchical and random strategy. The algorithm employs a Gaussian attenuation function to attenuate, layer by layer according to tree depth, the perturbation values introduced by the robust splitting criterion, progressively reducing the negative impact of the perturbation. In addition, a random strategy based on the Bernoulli distribution controls the splitting process of the trees in the forest, coordinating the model's performance under both natural scenarios and adversarial attacks. The goal is a balanced strategy that improves the model's robustness against adversarial attacks while maintaining high performance on standard datasets. Extensive experiments on real-world datasets demonstrate that the proposed algorithm ensures a useful level of robustness while significantly mitigating the accuracy loss caused by robust training, thereby improving the model's generalization ability.
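The two mechanisms named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the Gaussian form `exp(-d²/2σ²)`, and the `sigma` and `p_robust` parameters are illustrative assumptions standing in for the unspecified attenuation schedule and Bernoulli gate.

```python
import math
import random

def depth_attenuated_epsilon(epsilon0, depth, sigma=2.0):
    """Gaussian decay of the split-time perturbation with tree depth.

    Illustrative only: deeper nodes receive an exponentially smaller
    perturbation, so robust splitting distorts the tree less and less
    as it grows, limiting the natural-accuracy loss.
    """
    return epsilon0 * math.exp(-depth ** 2 / (2.0 * sigma ** 2))

def choose_split_mode(p_robust, rng=random):
    """Bernoulli(p_robust) gate over the split procedure.

    Returns True to use the robust (perturbation-aware) split at this
    node, False to use the standard split, trading off adversarial
    robustness against performance on clean data.
    """
    return rng.random() < p_robust
```

With `p_robust = 1.0` every node is split robustly (maximal robustness, lowest natural accuracy); with `p_robust = 0.0` the forest reduces to a standard random forest. Intermediate values interpolate between the two regimes, which is the balance the abstract describes.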

Key words: random forest, adversarial training, robustness, machine learning