Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (7): 158-163. DOI: 10.3778/j.issn.1002-8331.1912-0452

• Pattern Recognition and Artificial Intelligence •

Prediction Model for Number of Software Defects in Loop

LI Li (李莉), JI Xinyuan (纪欣沅), SONG Song (宋嵩)

  1. College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
  • Online: 2021-04-01 Published: 2021-04-02

Abstract:

Software defect prediction is an active research topic in software engineering. Work in this area typically addresses two questions: whether a software module contains defects, and how many defects it contains. Current research on defect numbers concentrates on ranking software modules by their number of defects. To improve the accuracy of this ranking, this paper proposes a loop-based model for predicting the number of software defects. The model consists of two parts: loop feature selection and defect prediction. In the loop feature selection part, an improved density peak clustering algorithm is combined with a wrapper feature selection method to select the optimal features dynamically, in a loop, and to train the learners. In the defect prediction part, an inverse-distance-weighted ensemble is used to obtain the prediction results. Experimental results show that the model improves on LRCR, GRCR, LR, MLP, GP, NBR and ZIP by 10.36%, 28.74%, 13.51%, 36.61%, 25.30%, 60.14% and 54.72% respectively, which helps improve the accuracy of software defect prediction and the efficiency of software testing.

Key words: software defect number prediction, density peak clustering, loop feature selection, inverse distance weighting, ensemble learning
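
As a rough illustration of the inverse-distance-weighted ensemble step mentioned in the abstract, the sketch below combines the defect counts predicted by several base learners for one software module. The abstract does not say how distances are measured, so everything specific here is an assumption: each learner is tied to a hypothetical reference point (for example a cluster center), distance is Euclidean, and the weight is 1 / d^power. The function name `idw_ensemble_predict` and its parameters are illustrative and do not come from the paper.

```python
import math


def idw_ensemble_predict(x, centers, predictions, power=2.0):
    """Combine base-learner predictions with inverse-distance weights.

    Assumptions (not taken from the paper):
      - centers[i] is a reference point for base learner i,
        e.g. the center of the cluster it was trained on;
      - predictions[i] is that learner's predicted defect count for x;
      - distance is Euclidean and the weight is 1 / distance**power.
    """
    dists = [math.dist(x, c) for c in centers]

    # If x coincides with one or more reference points, average those
    # learners' predictions directly to avoid dividing by zero.
    exact = [p for d, p in zip(dists, predictions) if d == 0.0]
    if exact:
        return sum(exact) / len(exact)

    weights = [1.0 / d ** power for d in dists]
    weighted_sum = sum(w * p for w, p in zip(weights, predictions))
    return weighted_sum / sum(weights)


# Two learners at distances 1 and 2 from the module get weights 1 and 0.25,
# so the nearer learner dominates: (1*4 + 0.25*8) / 1.25 = 4.8
print(idw_ensemble_predict([0.0, 0.0], [[1.0, 0.0], [2.0, 0.0]], [4.0, 8.0]))  # 4.8
```

A larger `power` pulls the result further toward the nearest learner, while `power=0` reduces to a plain average of the predictions.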