Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (36): 125-128.

• Database, Signal and Information Processing •

遗传算法的粗糙集理论在文本降维上的应用

赵东红1,2,王来生2,张  峰1   

  1. 北京科技大学 数理学院 应用数学系,北京 100083
  2. 中国农业大学 理学院 数学系,北京 100085
  • 出版日期:2012-12-21 发布日期:2012-12-21

Genetic algorithm of rough set theory in text dimension reduction applications

ZHAO Donghong1,2, WANG Laisheng2, ZHANG Feng1   

  1. Department of Applied Mathematics, School of Mathematics and Physics, University of Science and Technology Beijing, Beijing 100083, China
  2. Department of Mathematics, School of Science, China Agricultural University, Beijing 100085, China
  • Online:2012-12-21 Published:2012-12-21

摘要: 遗传算法作为一种有效的全局并行优化搜索工具,早被众多应用领域所接受。根据问题提出了相应的适应度函数,针对遗传算法和粗糙集理论两种方法各自的特点,将两种算法适当结合。还把结合后的方法和单一的粗糙集算法在文本分类效果上进行了对比。实验结果表明将遗传算法和粗糙集理论相结合的优化方法来应用到特征提取中,比单一的粗糙集算法,具有更好的降维效果,使得降维后的特征词更有利于文本数据的分类,大大优化了文本分类的效果。

关键词: 混合遗传算法, 特征降维, 文本分类

Abstract: Genetic algorithms are an effective tool for global parallel optimization search and have long been accepted in many application areas. Building on earlier genetic algorithms and on concepts from rough set theory, this paper proposes a fitness function suited to the problem and, taking the respective characteristics of the two methods into account, combines them for feature extraction in text classification. It also compares the combined method with the rough set algorithm alone in terms of text classification performance. The experimental results indicate that applying the combined genetic algorithm and rough set method to feature extraction gives better dimensionality reduction than the rough set algorithm alone, makes the feature words retained after reduction more conducive to classifying text data, and greatly improves the text classification results.

Key words: hybrid genetic algorithm, feature dimensionality reduction, text classification
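
The abstract does not state the fitness function or the genetic operators the paper uses, so the following is only a minimal sketch of the general idea. It assumes a commonly used formulation in which a chromosome is a binary mask over candidate feature words and the fitness is the rough set dependency degree of the selected features minus a penalty on subset size. All names (`dependency`, `fitness`, `evolve`), the penalty weight, the operators, and the toy data are hypothetical and are not taken from the paper.

```python
import random

def dependency(rows, labels, mask):
    """Rough set dependency degree: the fraction of documents whose
    equivalence class under the selected features has a single label."""
    selected = [i for i, bit in enumerate(mask) if bit]
    classes = {}
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in selected)
        classes.setdefault(key, []).append(label)
    positive = sum(len(ls) for ls in classes.values() if len(set(ls)) == 1)
    return positive / len(rows)

def fitness(rows, labels, mask, penalty=0.1):
    """Assumed fitness: reward dependency, penalise large feature subsets."""
    return dependency(rows, labels, mask) - penalty * sum(mask) / len(mask)

def evolve(rows, labels, pop_size=20, generations=40, p_mut=0.1, seed=0):
    """Plain GA over binary feature masks (needs at least two features)."""
    rng = random.Random(seed)
    n = len(rows[0])
    score = lambda m: fitness(rows, labels, m)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]          # truncation selection, elitist
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)           # one-point crossover
            child = a[:cut] + b[cut:]
            # flip each bit independently with probability p_mut
            child = [bit ^ (rng.random() < p_mut) for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=score)

# Toy term-occurrence matrix (6 documents, 4 candidate feature words);
# the class label happens to equal the first feature.
rows = [[1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 1, 0],
        [0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 1, 0]]
labels = [1, 1, 0, 0, 1, 0]
best = evolve(rows, labels)
print(best, dependency(rows, labels, best))
```

In this formulation the size penalty is what drives the search toward small subsets: any mask that keeps the first feature already separates the classes, and the penalty makes the smallest such mask the fittest.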