Software fault localization model based on linear classification algorithm

doi:10.3778/j.issn.1002-8331.1606-0176

Abstract

Abstract: Spectrum-Based Fault Localization (SBFL) techniques help developers reduce debugging effort. As a promising lightweight approach, SBFL collects only the result of each test (passed or failed) and the corresponding coverage information. From these data, SBFL computes a runtime spectrum for each program statement. SBFL approaches apply various functions that map four profile features to a suspiciousness score. However, existing functions do not achieve good accuracy, because their fixed parameters cannot adapt to different program sets. Therefore, a machine learning method is proposed that automatically constructs a suspiciousness function for a specific program set. First, old versions of a program whose faulty code has been labeled are collected. Next, the feature difference between each pair of a faulty statement and a non-faulty statement is mapped to an instance in the training dataset. Finally, a linear classification algorithm from Weka is applied to learn a mapping function. The function learned from the old versions is defined as the fault localization model of the program. To assess the validity of the proposed method, an experiment is performed on three benchmark datasets: the Siemens suite, space, and gzip. The experimental results demonstrate that the proposed method reduces fault localization cost compared with existing SBFL approaches.

Key words: classification algorithm, linear model, fault localization, program spectra, software testing

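Chinese abstract (translated): Spectrum-based fault localization (SBFL) methods help programmers reduce the difficulty of software debugging. As a lightweight approach, SBFL only needs to collect the coverage information and test results of the test cases in order to compute the runtime features of each program statement. The many SBFL methods combine four runtime features into different suspiciousness formulas. However, these formulas are constrained by fixed parameters and cannot adapt to different program sets. Therefore, a machine learning method is proposed that automatically determines the suspiciousness formula for a specific program set. First, old versions of the program with labeled faulty statements are collected; next, the runtime features of each faulty statement and each correct statement are subtracted pairwise, and each difference forms one sample of the training set; finally, a classification algorithm from Weka learns a linear function, which serves as the fault localization model for the program. On three benchmark datasets (the Siemens suite, space, and gzip), the models learned with Logistic, SGD, SMO, and LibLinear all outperform SBFL methods.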

Chinese key words (translated): classification algorithm, linear model, fault localization, program spectrum, software testing
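The pipeline described in the abstract can be illustrated with a small sketch. It is not the author's implementation: the function names, the toy coverage matrix, and the training loop are all hypothetical. Two assumptions are worth stating plainly. First, the four profile features are taken to be the usual SBFL counts (failed and passed tests that execute or do not execute a statement). Second, each faulty/non-faulty difference is also added in negated form with the opposite label so that the classifier sees two classes, and a plain gradient-descent logistic regression stands in for the Weka learners (Logistic, SGD, SMO, LibLinear) named in the abstract. Ochiai is included only as an example of a fixed-formula baseline.

```python
from math import exp, sqrt

def spectrum(coverage, failed):
    """Per-statement counts (ef, ep, nf, np): failed/passed tests that
    execute (e) or do not execute (n) the statement."""
    total_f = sum(failed)
    total_p = len(failed) - total_f
    feats = []
    for s in range(len(coverage[0])):
        ef = sum(1 for row, f in zip(coverage, failed) if row[s] and f)
        ep = sum(1 for row, f in zip(coverage, failed) if row[s] and not f)
        feats.append((ef, ep, total_f - ef, total_p - ep))
    return feats

def ochiai(ef, ep, nf, np_):
    """A fixed-parameter suspiciousness formula, shown for contrast."""
    denom = sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

def pairwise_instances(feats, faulty):
    """One instance per (faulty, non-faulty) pair: the feature difference,
    labeled 1. The negated difference labeled 0 is an assumption made
    here so that both classes are present."""
    data = []
    for i in faulty:
        for j in range(len(feats)):
            if j in faulty:
                continue
            diff = [a - b for a, b in zip(feats[i], feats[j])]
            data.append((diff, 1))
            data.append(([-d for d in diff], 0))
    return data

def train_linear(data, epochs=200, lr=0.1):
    """Logistic regression without a bias term, trained by stochastic
    gradient descent; a stand-in for a Weka linear classifier."""
    w = [0.0] * 4
    for _ in range(epochs):
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x))
            z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
            p = 1.0 / (1.0 + exp(-z))
            w = [wi + lr * (y - p) * xi for wi, xi in zip(w, x)]
    return w

# Toy "old version": 4 tests x 3 statements; statement 1 is the labeled fault.
COVERAGE = [[1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 0, 0]]
FAILED = [True, False, True, False]
FAULTY = {1}

feats = spectrum(COVERAGE, FAILED)
weights = train_linear(pairwise_instances(feats, FAULTY))

# The learned linear function scores each statement; higher is more suspicious.
scores = [sum(w * x for w, x in zip(weights, f)) for f in feats]
ranking = sorted(range(len(scores)), key=lambda s: -scores[s])
print(ranking)
```

In this sketch the learned weights replace the fixed parameters of a formula such as Ochiai. In practice the training pairs would come from many old faulty versions, and the learned function would then be applied to rank statements in a new version.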

HE Haijiang. Software fault localization model based on linear classification algorithm[J]. Computer Engineering and Applications, 2017, 53(21): 42-48.

何海江. 基于线性分类算法的软件错误定位模型[J]. 计算机工程与应用, 2017, 53(21): 42-48.
