Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (4): 55-60. DOI: 10.3778/j.issn.1002-8331.2002-0061

Fast Attribute Reduction Algorithm Based on Fuzzy Hierarchical Quotient Space

DAI Qi, LI Min, LIU Yang, LI Lihong   

  1. College of Science, North China University of Science and Technology, Tangshan, Hebei 063210, China
  2. Hebei Key Laboratory of Data Science and Application, Tangshan, Hebei 063210, China
  3. Tangshan Key Laboratory of Data Science, Tangshan, Hebei 063210, China
  • Online: 2021-02-15 Published: 2021-02-06

Abstract:

Traditional attribute reduction algorithms based on equivalence relations involve a cumbersome calculation process and run slowly when the sample set is large. To address these problems, this paper proposes a fast attribute reduction algorithm based on the fuzzy Euclidean distance. First, a fuzzy Euclidean distance is defined to calculate the distance between attributes. Second, a hierarchical quotient space structure is used to construct the granular layer space for reduction. Finally, the clustering result of the granular layer space serves as the basis for reducing the attributes of the sample set. Simulation results show that the reduction speed of the algorithm is not limited by the number of samples in the sample set and that the algorithm runs quickly. It reduces the data quickly without deleting any samples. The reduction has little impact on the classification accuracy of the data sets, and the classification accuracy of some data sets even improves, which offers a new research direction for the reduction of large-scale data sets.

Key words: hierarchical quotient space, fuzzy Euclidean distance, attribute reduction
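
The three steps named in the abstract (distance between attributes, hierarchical granulation, reduction from the clustering result) can be illustrated with a minimal sketch. The abstract does not give the paper's definition of the fuzzy Euclidean distance, so everything below is an assumption made for illustration: attributes are min-max scaled to [0, 1] and read as membership degrees, the distance is the root-mean-square difference of two membership vectors, and each granular layer is the partition obtained by linking attributes whose distance is at most a threshold. All function names are hypothetical and this is not the authors' implementation.

```python
from math import sqrt


def normalize_columns(rows):
    """Min-max scale every attribute (column) to [0, 1].

    Assumption: the scaled value is treated as a fuzzy membership degree.
    Returns one vector per attribute, not per sample.
    """
    scaled = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        scaled.append([(v - lo) / span if span else 0.0 for v in col])
    return scaled


def attribute_distance(a, b):
    """Assumed distance: root-mean-square difference of two membership vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def granulate(attrs, threshold):
    """Partition attribute indices into blocks at one granular layer.

    Two attributes share a block when a chain of pairwise distances, each
    at most `threshold`, connects them (union-find). Raising the threshold
    can only merge blocks, so the layers are nested.
    """
    parent = list(range(len(attrs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(attrs)):
        for j in range(i + 1, len(attrs)):
            if attribute_distance(attrs[i], attrs[j]) <= threshold:
                parent[find(j)] = find(i)

    blocks = {}
    for i in range(len(attrs)):
        blocks.setdefault(find(i), []).append(i)
    return sorted(blocks.values())


def reduce_attributes(rows, threshold):
    """Keep the first attribute of every block; no sample is removed."""
    blocks = granulate(normalize_columns(rows), threshold)
    kept = [block[0] for block in blocks]
    return kept, [[row[k] for k in kept] for row in rows]


# Toy data: attribute 1 is nearly a rescaled copy of attribute 0.
rows = [
    [1.0, 2.0, 9.0],
    [2.0, 4.1, 3.0],
    [3.0, 6.0, 7.0],
    [4.0, 8.0, 1.0],
]
kept, reduced = reduce_attributes(rows, threshold=0.1)
print(kept)  # indices of the retained attributes
```

In this sketch the number of pairwise distances depends on the number of attributes, and each distance is a single pass over the samples. Choosing which attribute represents a block (here simply the first) is a further assumption; the abstract does not state the selection rule.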
