Computer Engineering and Applications, 2025, Vol. 61, Issue (19): 106-117. DOI: 10.3778/j.issn.1002-8331.2501-0317

• Theory, Research and Development •

Multi-Label Feature Selection Based on Three-Way Decisions and Neighborhood Mutual Information

XIE Jinping, QIAN Wenbin, CAI Xingxing   

  1. School of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang 330045, China
  2. School of Software, Jiangxi Agricultural University, Nanchang 330045, China
  • Online: 2025-10-01 Published: 2025-09-30

基于三支决策和邻域互信息的多标记特征选择方法

谢觐平,钱文彬,蔡星星   

  1. 江西农业大学 计算机与信息工程学院,南昌 330045
  2. 江西农业大学 软件学院,南昌 330045

Abstract: Feature selection is a critical step in multi-label learning for improving model performance and reducing computational complexity. However, few multi-label feature selection methods consider how differences in label weights affect the feature space, and they also neglect the misclassification that arises when uncertain samples are partitioned. To address these issues, this work presents a multi-label feature selection method based on three-way decisions and neighborhood mutual information. First, neighborhood mutual information is used to explore how label weights relate to the feature space, and a label significance measure based on neighborhood mutual information and label correlation is designed to characterize the impact of labels on the feature space. Second, samples are processed differentially at multiple levels of granularity: three-way decision theory is integrated into multi-label feature selection to partition the samples by granularity and reduce computational complexity. Finally, an objective function combining three-way decisions and neighborhood mutual information is constructed to measure feature importance. Experimental analysis and performance comparisons on eight real-world multi-label datasets validate the feasibility and effectiveness of the proposed algorithm.
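The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the label significance measure or the objective function, so neither is reproduced here. The sketch assumes the neighborhood mutual information commonly used in the neighborhood rough set literature, NMI(F; L) = -(1/n) Σ log( |δ_F(x_i)| · |δ_L(x_i)| / (n · |δ_{F∪L}(x_i)|) ), with Euclidean neighborhoods of radius `delta` on a feature and equivalence classes on a single label. The function names, the base-2 logarithm, and the thresholds `alpha` and `beta` are illustrative assumptions.

```python
import numpy as np


def neighborhoods(X, delta):
    """Boolean matrix N with N[i, j] True when sample j lies within
    Euclidean distance delta of sample i (each sample contains itself)."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a single feature as one column
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    return dist <= delta


def neighborhood_mutual_information(X, y, delta=0.2):
    """Neighborhood mutual information between feature(s) X and one label y.

    Assumed definition (not taken from the paper):
    NMI = -(1/n) * sum_i log2(|d_X(x_i)| * |d_y(x_i)| / (n * |d_Xy(x_i)|)).
    """
    y = np.asarray(y)
    n = len(y)
    nx = neighborhoods(X, delta)        # neighborhoods in feature space
    ny = y[:, None] == y[None, :]       # label equivalence classes
    nxy = nx & ny                       # joint neighborhoods
    cx, cy, cxy = nx.sum(axis=1), ny.sum(axis=1), nxy.sum(axis=1)
    return float(-np.mean(np.log2(cx * cy / (n * cxy))))


def three_way_partition(scores, alpha=0.7, beta=0.3):
    """Split sample indices into positive, boundary, and negative regions.

    alpha and beta are hypothetical thresholds with beta < alpha; samples
    scoring between them are deferred to the boundary region rather than
    being forced into acceptance or rejection.
    """
    scores = np.asarray(scores, dtype=float)
    positive = np.flatnonzero(scores >= alpha)
    negative = np.flatnonzero(scores <= beta)
    boundary = np.flatnonzero((scores > beta) & (scores < alpha))
    return positive, boundary, negative
```

Under these assumptions, a feature whose neighborhoods coincide with the label classes scores high, while a constant feature scores zero. The boundary region is what lets uncertain samples be handled separately instead of being misclassified outright.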

Key words: feature selection, three-way decisions, granular computing, neighborhood mutual information, multi-label learning, label weight, label relationship

摘要: 多标记学习中的特征选择是提高模型性能和降低计算复杂性的关键步骤,现有的多标记特征选择算法未考虑标记空间下标记权重的差异对特征空间的影响,同时忽略了划分样本过程中出现不确定样本划分的误分类情况。为此,提出了一种基于三支决策和邻域互信息的多标记特征选择方法。为了探索标记权重在特征空间的关联,采用邻域互信息,设计了基于邻域互信息与标记相关性的标记重要度评价方法,用来刻画标记对特征空间的影响;在多层次下对样本进行差异化处理,将三支决策理论融入多标记特征选择过程中,对样本进行粒度划分简化计算的复杂性。结合三支决策和邻域互信息构建度量特征重要性的目标函数;通过在8个真实多标记数据集的实验分析和对比算法性能,进一步验证了该算法的可行性和有效性。

关键词: 特征选择, 三支决策, 粒计算, 邻域互信息, 多标记学习, 标记权重, 标记关系