Computer Engineering and Applications, 2024, Vol. 60, Issue (3): 246-254. DOI: 10.3778/j.issn.1002-8331.2209-0144

• Graphics and Image Processing •

Visual Explanation Method Combining Hierarchical Decision Information

YANG Chuandong, QIAN Lizhi, SUN Ziwen, CHEN Dong, LING Chong

  1. Department of Ordnance Engineering, PLA Army Academy of Artillery and Air Defense, Hefei 230031, China
  • Online: 2024-02-01    Published: 2024-02-01

Abstract: Visual explanation is an active research topic in the interpretability of deep neural networks (DNNs), but existing methods fail to make effective use of hierarchical (multi-level) decision information, which leads to poor visual explanations. To address this problem, this paper proposes a visual explanation method that combines hierarchical decision information. First, fine-grained, local-level decision information is mined from the feature maps to generate a set of weighted feature maps that are strongly correlated with the decision result; these maps are merged by sequential grouping to obtain a set of low-redundancy masks. Next, the masks are processed with boundary blurring and an integral method, and the importance scores of the grouped masks are computed in parallel from their global-level decision contribution, which improves both the algorithm's sensitivity to global decision information and its speed. Finally, the optimal parameter combination is determined through ablation experiments, and the method is compared qualitatively and quantitatively with existing state-of-the-art visual explanation methods on the ImageNet dataset. The experimental results show that, by combining hierarchical decision information, the proposed method achieves better visual explanation results in the confidence test and the localization test, with a running time of 68 ms.

Key words: visual explanation, hierarchical information, global decision contribution, class activation mapping (CAM), integral method
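
As a rough illustration only, the pipeline outlined in the abstract (weighted feature maps, sequential grouping into masks, boundary blurring, integration, and batched scoring) might be sketched as follows. The abstract gives no formulas, so every concrete choice here is an assumption rather than the authors' implementation: the element-wise gradient-times-activation weighting, the box blur, the blurred-image baseline, the uniform integration steps, and the hypothetical names `box_blur`, `grouped_masks`, `saliency`, and `toy_score`. The sketch also assumes the image height and width are the same integer multiple of the feature-map size.

```python
import numpy as np

def box_blur(x, k=3):
    """Box-blur the last two axes of x with a k x k window (edge-padded)."""
    pad = k // 2
    widths = [(0, 0)] * (x.ndim - 2) + [(pad, pad), (pad, pad)]
    xp = np.pad(x, widths, mode="edge")
    h, w = x.shape[-2:]
    out = np.zeros_like(x, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += xp[..., dy:dy + h, dx:dx + w]
    return out / (k * k)

def grouped_masks(acts, grads, n_groups, scale):
    """Build n_groups soft masks from (K, h, w) activations and gradients."""
    # Assumed local weighting: element-wise gradient x activation, rectified.
    weighted = np.maximum(grads * acts, 0.0)
    # Sequential grouping: split channels in their original order and sum.
    groups = np.array_split(weighted, n_groups, axis=0)
    masks = np.stack([g.sum(axis=0) for g in groups])
    # Min-max normalise each mask to [0, 1].
    lo = masks.min(axis=(1, 2), keepdims=True)
    hi = masks.max(axis=(1, 2), keepdims=True)
    masks = (masks - lo) / (hi - lo + 1e-8)
    # Nearest-neighbour upsampling to image size, then soften the boundaries.
    masks = np.kron(masks, np.ones((1, scale, scale)))
    return box_blur(masks)

def saliency(image, acts, grads, score_fn, n_groups=4, n_steps=8):
    """Score each mask by its averaged effect on the class score, then fuse."""
    scale = image.shape[-1] // acts.shape[-1]
    masks = grouped_masks(acts, grads, n_groups, scale)       # (B, H, W)
    baseline = box_blur(image, k=5)                           # assumed baseline
    alphas = np.arange(1, n_steps + 1) / n_steps              # integration steps
    # One batch of shape (B, S, C, H, W) so all masks are scored together.
    m = masks[:, None, None] * alphas[None, :, None, None, None]
    batch = baseline + m * (image - baseline)
    scores = score_fn(batch.reshape(-1, *image.shape))
    scores = scores.reshape(n_groups, n_steps)
    # Global contribution: mean score along the path minus the baseline score.
    contrib = scores.mean(axis=1) - score_fn(baseline[None])[0]
    sal = np.maximum((contrib[:, None, None] * masks).sum(axis=0), 0.0)
    return sal / (sal.max() + 1e-8)

# Toy usage with random data and a stand-in "class score" (not a real network).
rng = np.random.default_rng(0)
image = rng.random((3, 8, 8))
acts = rng.random((6, 4, 4))
grads = rng.standard_normal((6, 4, 4))
toy_score = lambda x: x[:, :, :4, :4].mean(axis=(1, 2, 3))
sal = saliency(image, acts, grads, toy_score)
print(sal.shape)
```

In a real setting `score_fn` would wrap a forward pass of the network for the target class, and `acts` and `grads` would come from a chosen convolutional layer. Stacking all masks and integration steps into one batch is what lets the scores be computed in a single pass in this sketch.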