Research on Repair Preference of Software Automatic Repair Method Based on Deep Learning

doi:10.3778/j.issn.1002-8331.2303-0147

Abstract

Abstract: Machine-learning-based automatic software repair can reduce the cost of fixing software defects by repairing them without manual intervention. However, the repair preferences of different repair tools for different types of defects remain unclear, and because the tools are not applied in a targeted way, the deep learning models behind them cannot be used to full effect. Building on a study of defect classification, this paper examines the overall repair probability of several representative deep-learning-based automatic software repair methods for different defect types, and compares the repair preferences of different learning models across those types, so that later model selection and automatic repair work can be better informed. Experimental results show that deep-learning-based automatic software repair methods tend to repair defects of the IF statement, method statement, and return statement types. The autoencoder-based method tends to repair IF statement defects, the LSTM-based encoder-decoder method tends to repair defects related to method statements, and the CNN-based encoder-decoder method shows similar repair preferences for IF statement, method statement, and return statement defects.

Key words: deep learning, automatic software repair, defect classification, repair preference

JIANG Yuanpeng, HUANG Ying, JIANG Shujuan. Research on Repair Preference of Software Automatic Repair Method Based on Deep Learning[J]. Computer Engineering and Applications, 2023, 59(19): 266-273.
