Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (14): 1-14. DOI: 10.3778/j.issn.1002-8331.2208-0322
ZHAO Yanyu, ZHAO Xiaoyong, WANG Lei, WANG Ningning
Online: 2023-07-15
Published: 2023-07-15
Abstract: With the development of machine learning and deep learning, artificial intelligence has gradually been applied across many fields. One of the greatest drawbacks of adopting AI, however, is its inability to explain the basis for its predictions. The black-box nature of models prevents humans from truly trusting them in mission-critical scenarios such as healthcare, finance, and autonomous driving, which in turn limits the practical deployment of AI in these fields. Advancing explainable artificial intelligence (XAI) has therefore become an important problem for bringing AI into mission-critical applications. At present, surveys of XAI remain scarce both in China and abroad, as does attention to causal explanation methods and to the evaluation of interpretability methods. Starting from the characteristics of explanation methods, this paper divides the main interpretability methods into three categories — model-agnostic methods, model-specific methods, and causal explanation methods — and summarizes and analyzes each in turn. It also summarizes approaches to evaluating explanation methods, lists applications of XAI, and discusses open problems and future directions for interpretability.
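To make the model-agnostic category concrete, below is a minimal sketch of permutation feature importance — a classic model-agnostic technique of the kind the survey covers. It treats the predictor as an opaque function and measures how much the prediction error grows when each feature column is shuffled. The toy data, the least-squares "black box", and all names here are illustrative assumptions, not code from the paper:

```python
import numpy as np

# Toy data: y depends strongly on feature 0, weakly on feature 1, not at all on feature 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Stand-in "black-box" model: ordinary least squares with a bias column.
# Any opaque predictor exposing predict(X) -> y_hat would work the same way.
w, *_ = np.linalg.lstsq(np.c_[X, np.ones(len(X))], y, rcond=None)
predict = lambda M: np.c_[M, np.ones(len(M))] @ w

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average increase in MSE when one feature column is randomly permuted.

    Model-agnostic: only the predict function is queried, never the model internals.
    """
    rng = np.random.default_rng(seed)
    base = mse(y, predict(X))
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # break the feature-target link
            scores[j] += mse(y, predict(Xp)) - base
    return scores / n_repeats

importance = permutation_importance(predict, X, y)
# Expect: feature 0 dominates, feature 1 is small, feature 2 is near zero.
```

Because the procedure only queries `predict`, the same function applies unchanged to a gradient-boosted ensemble or a neural network — which is precisely what distinguishes model-agnostic from model-specific explanation methods.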
ZHAO Yanyu, ZHAO Xiaoyong, WANG Lei, WANG Ningning. Review of Explainable Artificial Intelligence[J]. Computer Engineering and Applications, 2023, 59(14): 1-14.