Computer Engineering and Applications, 2024, Vol. 60, Issue (19): 97-109. DOI: 10.3778/j.issn.1002-8331.2312-0230

• Theory, Research and Development •

Boundary Domain Solving and Classification Method of Three-Way Decisions Based on Multi-Objective (基于多目标的三支决策边界域求解与分类方法)

NIE Bin (聂斌), JIN Haike (靳海科), DU Jianqiang (杜建强), ZHANG Yuchao (张玉超), ZHENG Xuepeng (郑学鹏), CHEN Xingxin (陈星鑫), MIAO Zhen (苗震)

  1. College of Computer Science, Jiangxi University of Chinese Medicine, Nanchang 330004, China
  • Publication date: 2024-10-01 Release date: 2024-09-30

Abstract: Three-way decision assigns uncertain samples to a boundary domain for delayed decision-making, but delimiting that domain requires a threshold derived from a loss function. The loss function usually depends on prior knowledge and is therefore somewhat subjective, which limits how well the boundary domain can be delimited. To address this problem, a multi-objective method for solving the three-way decision boundary domain is constructed, with the aim of delimiting the boundary domain better and improving classification performance. First, Bayes' rule is used to obtain the conditional probability of each sample. Then, three objectives are set: reducing the uncertainty of the boundary domain, shrinking the size of the boundary domain, and reducing the misclassification rate over the whole decision region. The optimal threshold is obtained with TOPSIS (technique for order preference by similarity to an ideal solution) combined with the entropy weight method, in which the entropy weight method computes the weights of the three objectives; the resulting threshold yields the boundary domain, on which decisions are deferred. Finally, the boundary domain is classified in combination with different classifiers. Comparative experiments on UCI data sets show, in terms of classification accuracy and F1 score, that the threshold learned by this method divides the boundary domain reasonably and that the resulting model achieves better classification performance.

Keywords: classification uncertainty, three-way decisions, boundary domain, multi-objective, optimal threshold
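
The threshold-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the objective formulas, so the three measures below (summed binary entropy of boundary samples, boundary fraction, and wrong accept/reject decisions over all samples), the min-max normalisation, the grid search over a threshold pair (alpha, beta), and the binary-label setting are all assumptions, and the function names are hypothetical.

```python
import numpy as np


def objectives(p, y, alpha, beta):
    """Three cost-type objectives for one threshold pair (beta < alpha).

    p: posterior P(positive | x) per sample; y: 0/1 labels.
    All three measures are illustrative stand-ins, not the paper's formulas.
    """
    pos, neg = p >= alpha, p <= beta          # accept / reject regions
    bnd = ~(pos | neg)                        # boundary domain (deferred)
    q = np.clip(p[bnd], 1e-12, 1 - 1e-12)
    # Assumed uncertainty: summed binary entropy of boundary samples, per sample.
    uncertainty = float(np.sum(-q * np.log2(q) - (1 - q) * np.log2(1 - q))) / len(p)
    size = float(bnd.mean())                  # fraction of samples deferred
    # Assumed error: wrong decisions in the accept and reject regions.
    error = float(np.sum(pos & (y == 0)) + np.sum(neg & (y == 1))) / len(p)
    return uncertainty, size, error


def entropy_weights(r):
    """Entropy weight method on a benefit-normalised matrix (rows = candidates)."""
    m = r.shape[0]                            # needs at least two candidates
    col_sum = r.sum(axis=0)
    # A constant column carries no information: treat it as uniform (weight 0).
    share = np.where(col_sum > 0, r / np.where(col_sum > 0, col_sum, 1.0), 1.0 / m)
    logs = np.log(share, where=share > 0, out=np.zeros_like(share))
    e = -(share * logs).sum(axis=0) / np.log(m)   # entropy per objective
    d = np.clip(1.0 - e, 0.0, None)               # degree of divergence
    k = r.shape[1]
    return d / d.sum() if d.sum() > 0 else np.full(k, 1.0 / k)


def select_thresholds(p, y, grid):
    """Rank candidate (alpha, beta) pairs with entropy-weighted TOPSIS."""
    cands = [(a, b) for a in grid for b in grid if b < a]
    costs = np.array([objectives(p, y, a, b) for a, b in cands])
    span = costs.max(axis=0) - costs.min(axis=0)
    # Min-max normalisation that turns costs into benefits in [0, 1].
    r = (costs.max(axis=0) - costs) / np.where(span > 0, span, 1.0)
    w = entropy_weights(r)
    v = w * r                                  # weighted normalised matrix
    d_best = np.linalg.norm(v - v.max(axis=0), axis=1)    # distance to ideal
    d_worst = np.linalg.norm(v - v.min(axis=0), axis=1)   # distance to anti-ideal
    closeness = d_worst / (d_best + d_worst + 1e-12)
    return cands[int(np.argmax(closeness))], w
```

Calling `select_thresholds(p, y, np.linspace(0.1, 0.9, 9))` returns the selected pair and the three objective weights. Samples whose posterior falls strictly between `beta` and `alpha` would then be passed to a second classifier, which the sketch leaves out.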