Computer Engineering and Applications ›› 2019, Vol. 55 ›› Issue (11): 204-208. DOI: 10.3778/j.issn.1002-8331.1812-0254


Research on Software Defect Prediction Based on Bayesian Logistic Regression

LAI Yongkai1, CHEN Xiangyu2, LIU Hai2   

  1. Institute of Education, Shaoguan University, Shaoguan, Guangdong 512005, China
  2. School of Computer Science & Engineering, South China University of Technology, Guangzhou 510006, China
  • Online: 2019-06-01  Published: 2019-05-30


Abstract: Identifying software defects early in development helps the project management team allocate development and testing resources in time, so that strict quality assurance activities can be applied to the software modules most likely to contain defects. This plays an important role in delivering high-quality software, and software defect prediction has therefore become a research hotspot in software engineering. Although defect prediction models have been built with several machine learning algorithms, Bayesian formulations of these models have not been explored. This paper proposes Bayesian Logistic regression with non-informative and informative priors for building defect prediction models, and studies both the advantages of Bayesian Logistic regression and the role the prior plays in its performance. Finally, the method is compared with existing defect prediction methods (LR, NB, RF, SVM) on the PROMISE dataset; the results show that Bayesian Logistic regression achieves good prediction performance.

Key words: software defect prediction, Bayesian Logistic regression, informative priors
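
As a rough illustration of how a prior enters Logistic regression, the sketch below computes a maximum a posteriori (MAP) estimate under an independent Gaussian prior on each weight. It is not the paper's procedure: the abstract does not say how the posterior is computed, and a full Bayesian treatment would characterise the whole posterior rather than a single point. The data are synthetic, and `fit_map_logistic`, `hist_mean` and the prior variances are hypothetical names and values chosen only for this example. A very large prior variance stands in for a non-informative prior, and a small variance around previously estimated coefficients stands in for an informative one.

```python
import numpy as np

def neg_log_posterior(w, X, y, prior_mean, prior_var):
    """Negative log posterior (up to a constant) with a N(prior_mean, prior_var) prior per weight."""
    z = X @ w
    nll = np.sum(np.logaddexp(0.0, z) - y * z)           # Bernoulli negative log-likelihood
    penalty = 0.5 * np.sum((w - prior_mean) ** 2) / prior_var
    return nll + penalty

def fit_map_logistic(X, y, prior_mean, prior_var, n_iter=50):
    """MAP estimate via Newton's method with step halving (illustrative, not the paper's method)."""
    prior_mean = np.asarray(prior_mean, dtype=float)
    w = prior_mean.copy()
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        grad = X.T @ (p - y) + (w - prior_mean) / prior_var
        hess = (X * (p * (1.0 - p))[:, None]).T @ X + np.eye(X.shape[1]) / prior_var
        step = np.linalg.solve(hess, grad)
        # Halve the step until the objective no longer increases.
        t = 1.0
        f0 = neg_log_posterior(w, X, y, prior_mean, prior_var)
        while neg_log_posterior(w - t * step, X, y, prior_mean, prior_var) > f0 and t > 1e-8:
            t *= 0.5
        w = w - t * step
    return w

# Synthetic stand-in for module metrics (intercept + 2 features) and defect labels.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
true_w = np.array([-1.0, 2.0, -1.5])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(X @ true_w)))).astype(float)

# Non-informative prior: zero mean, very large variance.
w_flat = fit_map_logistic(X, y, np.zeros(3), prior_var=1e6)

# Informative prior: centred on coefficients assumed to come from an earlier release (hypothetical).
hist_mean = np.array([-0.8, 1.8, -1.2])
w_inf = fit_map_logistic(X, y, hist_mean, prior_var=0.25)

print("flat prior MAP:       ", w_flat)
print("informative prior MAP:", w_inf)
```

With the near-flat prior the estimate essentially coincides with ordinary maximum-likelihood Logistic regression; as the prior variance shrinks, the estimate is pulled toward the prior mean, which is the mechanism through which prior knowledge can influence predictions when training data are scarce.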
