Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (22): 339-352. DOI: 10.3778/j.issn.1002-8331.2501-0275

• Engineering and Applications •

Interpretable Association Rule Defect Prediction Model Combining Counterfactuals and Multi-Objective Optimization

YU Qiao, JIANG Jiaxuan, REN Siyu, ZHU Yi   

  1. School of Computer Science and Technology, Jiangsu Normal University, Xuzhou, Jiangsu 221116, China
  • Online: 2025-11-15 Published: 2025-11-14

融合反事实与多目标优化的可解释关联规则缺陷预测模型

于巧,蒋佳漩,任思宇,祝义   

  1. 江苏师范大学 计算机科学与技术学院,江苏 徐州 221116

Abstract: Software defect prediction is key to ensuring software quality. Researchers have designed a variety of defect prediction models to improve prediction performance, but most of these models offer little transparency about how they arrive at their results. Developers therefore struggle to understand a model's internal logic and decision-making process, which makes the models hard to interpret. This lack of interpretability limits the credibility of the models and hinders their adoption in practical development. To address this problem, this paper combines multiple association rules into an interpretable multi-objective optimization model called MoCFR. MoCFR uses a counterfactual explanation method for feature selection, determining the importance score of each feature from the feature change rate of the counterfactual samples. On this basis, the model applies multi-objective optimization to construct an association rule classifier, simultaneously optimizing three key metrics: classification error, average number of rules, and confidence. Experimental results on the PROMISE dataset show that MoCFR outperforms existing rule-based classification models in terms of classification error and significantly reduces the number of rules compared with similar multi-objective optimization models.

Keywords: software defect prediction, association rule mining, multi-objective optimization, feature selection
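
The two ideas named in the abstract, scoring features by the change rate of counterfactual samples and trading off three objectives, can be illustrated with a minimal Python sketch. This is not the MoCFR implementation. It assumes that "feature change rate" means the fraction of counterfactual samples in which a feature differs from its original value, and that classification error and average number of rules are minimized while confidence is maximized (negated here so that every objective is minimized). The function names and all numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def feature_change_rates(
    originals: Sequence[Sequence[float]],
    counterfactuals: Sequence[Sequence[float]],
    tol: float = 1e-9,
) -> List[float]:
    """Fraction of (original, counterfactual) pairs in which each feature changed.

    Assumed reading of "feature change rate"; the paper may define it differently.
    """
    if len(originals) != len(counterfactuals) or not originals:
        raise ValueError("need equally sized, non-empty sample lists")
    n_features = len(originals[0])
    changed = [0] * n_features
    for x, cf in zip(originals, counterfactuals):
        for j in range(n_features):
            if abs(x[j] - cf[j]) > tol:
                changed[j] += 1
    return [count / len(originals) for count in changed]


Objectives = Tuple[float, float, float]  # (error, avg. rules, -confidence)


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: Sequence[Objectives]) -> List[Objectives]:
    """Keep the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates if other is not c)
    ]


# Hypothetical data: four modules with three metrics each, and one
# counterfactual per module (the nearest sample with a flipped prediction).
originals = [[10, 2, 0.5], [8, 3, 0.1], [12, 1, 0.9], [7, 2, 0.4]]
counterfactuals = [[6, 2, 0.5], [5, 3, 0.3], [9, 1, 0.9], [7, 1, 0.4]]
rates = feature_change_rates(originals, counterfactuals)

# Hypothetical rule-set candidates: (error, average number of rules, -confidence).
candidates = [(0.20, 5.0, -0.80), (0.25, 3.0, -0.85), (0.30, 6.0, -0.70)]
front = pareto_front(candidates)
```

In this toy data the first feature changes in three of the four counterfactuals, so it receives the highest score and would be kept first by a feature selector. The third candidate is worse than both others on all three objectives, so only the first two remain on the Pareto front.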

摘要: 软件缺陷预测是保证软件质量的关键。为了提高软件缺陷预测的性能,研究人员已经设计出多种缺陷预测模型,但大多数模型在提供预测结果时透明度较低,使得开发者难以理解模型内部的逻辑和决策过程,从而导致模型的不可解释性问题。该问题不仅限制了模型的可信度,也阻碍了其在实际发展中的应用。针对该问题,利用多个关联规则组合成一个可解释的多目标优化模型,被称为MoCFR。该模型采用反事实解释方法进行特征选择,通过反事实样本的特征变化率来确定每个特征的重要性分数。在此基础上,该模型运用多目标优化技术构建关联规则分类器,同时优化分类误差、规则平均数量和置信度三个关键指标。在PROMISE数据集上的实验结果表明,MoCFR在分类误差方面优于现有的基于规则的分类模型,与同类多目标优化模型相比,显著减少了规则数量。

关键词: 软件缺陷预测, 关联规则挖掘, 多目标优化, 特征选择