Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (20): 142-144. DOI: 10.3778/j.issn.1002-8331.2008.20.043

• Database, Signal and Information Processing •

Feature selection method based on fractal and changes of neighborhood space density

YANGGE Zhong-xiao, NI Zhi-wei, NI Li-ping, LIANG Min-jun

  1. School of Management, Hefei University of Technology, Hefei 230009, China
  • Received: 2007-10-10 Revised: 2007-12-25 Online: 2008-07-11 Published: 2008-07-11
  • Contact: YANGGE Zhong-xiao

Abstract: Feature selection is widely used in machine learning and data mining, usually as a primary preprocessing step. Selecting a feature subset that captures the fractal characteristics of a data set is valuable for studying the fractal regularities of that data set. Based on the fractal characteristics of the data set, this paper introduces a density analysis method, points out the shortcomings of existing feature selection methods based on fractal dimension, and proposes a feature selection method based on fractal and changes of neighborhood space density. To assess the validity of the experimental results, an SVM classifier combined with K-fold cross-validation is used to measure classification performance on three data sets before and after feature selection. Experimental results show that the method performs well compared with existing methods and can identify better feature subsets.

Key words: feature selection, fractal dimension, neighborhood space, density
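
The abstract does not spell out the algorithm, so the following is only a rough illustration of the fractal-dimension idea it builds on. The sketch estimates the correlation fractal dimension of a data set by box counting, then compares the estimate after dropping each attribute: removing a redundant attribute leaves the estimate unchanged, while removing an informative one lowers it. All function names, the grid radii, and the synthetic data are assumptions made for this example. It is not the authors' method, and it omits the neighborhood space density analysis that the paper adds.

```python
import math
import random
from collections import Counter


def box_occupancy_sum(points, r):
    """Sum of squared point counts over grid cells of side r."""
    cells = Counter(tuple(math.floor(x / r) for x in p) for p in points)
    return sum(c * c for c in cells.values())


def correlation_fractal_dimension(points, radii):
    """Least-squares slope of log S(r) against log r (box-counting estimate)."""
    xs = [math.log(r) for r in radii]
    ys = [math.log(box_occupancy_sum(points, r)) for r in radii]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def project(points, attrs):
    """Keep only the attributes (column indices) listed in attrs."""
    return [tuple(p[a] for a in attrs) for p in points]


# Synthetic data (assumed): attributes 0 and 1 are independent,
# attribute 2 is an exact copy of attribute 0 and therefore redundant.
rng = random.Random(0)
data = []
for _ in range(2000):
    x, y = rng.random(), rng.random()
    data.append((x, y, x))

radii = [1 / 2, 1 / 4, 1 / 8, 1 / 16]

d_full = correlation_fractal_dimension(data, radii)
d_without_copy = correlation_fractal_dimension(project(data, [0, 1]), radii)
d_without_y = correlation_fractal_dimension(project(data, [0, 2]), radii)

print("all attributes:       ", round(d_full, 3))
print("redundant copy dropped:", round(d_without_copy, 3))
print("attribute 1 dropped:   ", round(d_without_y, 3))
```

In this toy setting the first two estimates coincide, because the copied attribute adds no new grid cells, and the third falls to roughly one. A selection procedure in this spirit would discard attributes whose removal barely changes the estimate, and the retained subset could then be checked with an SVM classifier under K-fold cross-validation, as the abstract describes.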