Computer Engineering and Applications ›› 2015, Vol. 51 ›› Issue (11): 109-113.

• Database, Data Mining, Machine Learning •

Feature selection based on fractal dimension and multi-objective genetic algorithm

WU Man (吴曼), ZHANG Gongrang (张公让), LIU Heng (刘恒)

  1. School of Management, Hefei University of Technology, Hefei 230009, China
  2. Key Laboratory of Process Optimization and Intelligent Decision-making, Ministry of Education, Hefei University of Technology, Hefei 230009, China
  • Online: 2015-06-01 Published: 2015-06-12

Abstract: In text categorization systems, feature quality often strongly affects both the design and the performance of the classifier. This paper proposes a feature selection method that combines fractal dimension with NSGA-II, the non-dominated sorting genetic algorithm with an elitist strategy. Fractal dimension serves as an evaluation criterion for feature subsets, and NSGA-II treats feature subset selection as a multi-objective optimization problem. To assess the validity of the results, an SVM classifier is tested on the Fudan University corpus. The experimental results show that the method performs well: it effectively removes ineffective features and improves classification accuracy.

Key words: fractal dimension, multi-objective genetic algorithm, feature selection
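
The abstract names two ingredients: fractal dimension as a criterion for scoring feature subsets, and a multi-objective (Pareto-based) view of subset selection. The Python sketch below illustrates how they could fit together on a toy dataset. It is not the authors' implementation, and everything specific in it is an assumption: the abstract does not say how the fractal dimension is estimated or which objectives are optimized. Here the dimension is a box-counting estimate of the correlation dimension, the two objectives are subset size and the loss in fractal dimension relative to the full feature set, and exhaustive enumeration followed by a non-dominated filter stands in for the NSGA-II search. The names `correlation_dimension`, `project`, `dominates` and `pareto_front` are hypothetical.

```python
import math
import random
from collections import Counter
from itertools import combinations


def correlation_dimension(points, levels=(1, 2, 3, 4)):
    """Box-counting estimate of the correlation fractal dimension D2.

    Assumes every coordinate lies in [0, 1]. For cell side r = 2**-k,
    S(r) is the sum of squared cell occupancy fractions, and D2 is the
    least-squares slope of log S(r) against log r.
    """
    n = len(points)
    log_r, log_s = [], []
    for k in levels:
        cells = 2 ** k
        counts = Counter(
            tuple(min(int(v * cells), cells - 1) for v in p) for p in points
        )
        s = sum((c / n) ** 2 for c in counts.values())
        log_r.append(math.log(1.0 / cells))
        log_s.append(math.log(s))
    mean_r = sum(log_r) / len(log_r)
    mean_s = sum(log_s) / len(log_s)
    num = sum((a - mean_r) * (b - mean_s) for a, b in zip(log_r, log_s))
    den = sum((a - mean_r) ** 2 for a in log_r)
    return num / den


def project(points, subset):
    """Keep only the feature columns listed in subset."""
    return [tuple(p[i] for i in subset) for p in points]


def dominates(a, b):
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(scored):
    """Return the subsets whose objective vectors no other subset dominates."""
    return [s for s, obj in scored.items()
            if not any(dominates(other, obj) for other in scored.values())]


# Toy data (assumption): features 0 and 1 are independent, and
# feature 2 is an exact copy of feature 0, so it is redundant.
rng = random.Random(0)
data = []
for _ in range(2000):
    x, y = rng.random(), rng.random()
    data.append((x, y, x))

full_dim = correlation_dimension(data)

# Assumed objectives: (number of features, loss in fractal dimension).
scored = {}
for size in (1, 2, 3):
    for subset in combinations(range(3), size):
        dim = correlation_dimension(project(data, subset))
        scored[subset] = (size, abs(full_dim - dim))

front = pareto_front(scored)
print("full-set dimension estimate:", round(full_dim, 2))
print("non-dominated subsets:", sorted(front))
```

In this toy setup the redundant copy adds nothing to the estimated dimension, so a two-feature subset such as `(0, 1)` matches the full set's dimension and dominates the full three-feature set. On a real text corpus the number of subsets is far too large to enumerate, which is where a genetic search such as NSGA-II would replace the exhaustive loop.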