Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (6): 27-42.DOI: 10.3778/j.issn.1002-8331.2307-0382
• Research Hotspots and Reviews •
YAN Hong, YANG Fengyu, ZHONG Yihui, XIONG Yu, CHEN Yu’an
Online: 2024-03-15
Published: 2024-03-15
严荭,杨丰玉,钟依慧,熊宇,陈雨安
YAN Hong, YANG Fengyu, ZHONG Yihui, XIONG Yu, CHEN Yu’an. Survey on Test Input Selection and Metrics for Deep Neural Networks[J]. Computer Engineering and Applications, 2024, 60(6): 27-42.
严荭, 杨丰玉, 钟依慧, 熊宇, 陈雨安. 深度神经网络的测试输入选择与度量标准研究综述[J]. 计算机工程与应用, 2024, 60(6): 27-42.
URL: http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.2307-0382