Computer Engineering and Applications (计算机工程与应用), 2024, Vol. 60, Issue (8): 69-77. DOI: 10.3778/j.issn.1002-8331.2307-0232

• Theory, Research and Development •

ReliefF Weighted Neighborhood Rough Sets and Attribute Reduction Based on Random Multi-Attribute Subspaces

WANG Li (王莉)

  1. School of Computer and Network Engineering, Shanxi Datong University, Datong, Shanxi 037009
  • Online: 2024-04-15 Published: 2024-04-15

Abstract: Attribute reduction is an important preprocessing method for data dimensionality reduction, but most existing attribute reduction methods ignore the attribute weight information in an information system. Because the ReliefF algorithm is a simple and computationally efficient method for evaluating attribute weights, this paper proposes a ReliefF-weighted neighborhood rough set model and an attribute reduction algorithm based on random multi-attribute subspaces. First, the method generates multiple groups of partitions of the attribute set, each partition consisting of random subspaces of equal size. Within each group, ReliefF is applied to every random subspace to obtain local attribute weights, and the local weights from all groups are averaged to obtain the final global weight of each attribute in the information system. Then, based on these attribute weights, a new weighted neighborhood rough set model is proposed, and its related theorems and properties are proved. Finally, on the basis of this model, an attribute reduction algorithm for information systems is proposed using the weighted neighborhood dependency. Attribute reduction experiments on public datasets show that the proposed algorithm achieves better overall reduction performance than existing algorithms of the same type.

Key words: attribute reduction, ReliefF algorithm, random subspace, weighted neighborhood, neighborhood rough set model
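
The weighting step described in the abstract (random equal-size subspaces, local ReliefF weights per subspace, averaging across groups) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption rather than the paper's implementation: the function names `relieff_weights`, `random_partition`, and `global_weights` are hypothetical, the parameters `size`, `n_groups`, `k`, and `seed` are illustrative, and the ReliefF variant is simplified (every sample is visited instead of a random subset, Manhattan distance is used, and attribute values are assumed to be scaled to [0, 1]).

```python
import random


def relieff_weights(X, y, k=3):
    """Simplified ReliefF over the columns of X (values assumed in [0, 1])."""
    n, m = len(X), len(X[0])
    classes = sorted(set(y))
    prior = {c: y.count(c) / n for c in classes}
    w = [0.0] * m
    for i in range(n):
        # Manhattan distance from sample i to every other sample.
        dist = {j: sum(abs(X[i][a] - X[j][a]) for a in range(m))
                for j in range(n) if j != i}
        for c in classes:
            # k nearest neighbors of sample i within class c.
            near = sorted((j for j in dist if y[j] == c), key=dist.get)[:k]
            if not near:
                continue
            if c == y[i]:
                coef = -1.0  # nearest hits: differences lower the weight
            else:
                # nearest misses: differences raise the weight, scaled by class prior
                coef = prior[c] / (1.0 - prior[y[i]])
            for j in near:
                for a in range(m):
                    w[a] += coef * abs(X[i][a] - X[j][a]) / (len(near) * n)
    return w


def random_partition(n_attrs, size, rng):
    """Shuffle attribute indices and cut them into subspaces of `size` attributes.

    Assumes n_attrs is divisible by size so that all subspaces are equal.
    """
    idx = list(range(n_attrs))
    rng.shuffle(idx)
    return [idx[s:s + size] for s in range(0, n_attrs, size)]


def global_weights(X, y, size, n_groups, k=3, seed=0):
    """Average the local ReliefF weights over n_groups random partitions."""
    rng = random.Random(seed)
    n_attrs = len(X[0])
    total = [0.0] * n_attrs
    for _ in range(n_groups):
        for subspace in random_partition(n_attrs, size, rng):
            # Restrict the data to the attributes of this subspace.
            sub_X = [[row[a] for a in subspace] for row in X]
            local = relieff_weights(sub_X, y, k)
            for a, weight in zip(subspace, local):
                total[a] += weight
    # Each attribute appears exactly once per group, so divide by n_groups.
    return [t / n_groups for t in total]


# Toy data (made up for illustration): attribute 0 separates the two classes,
# attributes 1 to 3 do not.
X = [
    [0.0, 0.2, 0.9, 0.5],
    [0.1, 0.8, 0.1, 0.4],
    [0.2, 0.5, 0.6, 0.9],
    [0.8, 0.3, 0.8, 0.6],
    [0.9, 0.7, 0.2, 0.3],
    [1.0, 0.4, 0.5, 0.8],
]
y = [0, 0, 0, 1, 1, 1]
weights = global_weights(X, y, size=2, n_groups=10, k=2, seed=0)
print(weights)
```

On this toy data the class-separating attribute receives the largest global weight. How the weights then enter the neighborhood relation and the dependency measure is defined in the paper itself and is not reproduced here.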