Research on application and design of GA in feature selection

doi:10.3778/j.issn.1002-8331.2010.27.036

Computer Engineering and Applications, 2010, Vol. 46, Issue (27): 131-134. DOI: 10.3778/j.issn.1002-8331.2010.27.036

• Database, Signal and Information Processing •

HE Shao-rong¹, ZHU Hao-dong²,³

1. Department of Computer Science, Sichuan University of Science & Engineering, Zigong, Sichuan 643000, China
2. College of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China
3. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu 610041, China

Received: 2009-04-07 Revised: 2009-06-08 Online: 2010-09-21 Published: 2010-09-21
Contact: HE Shao-rong

Chinese title: GA在特征选择中的应用与设计研究
Chinese author names: 何绍荣¹, 朱颢东²,³

Abstract

Abstract: Selecting a more representative feature subset from a massive Chinese text collection is an NP-hard problem in text categorization, and genetic algorithms can often solve such problems effectively. To overcome the "drift" and "premature convergence" problems of the traditional genetic algorithm, this article first introduces rough sets and, on that basis, designs the fitness function, an adaptive crossover operator, an adaptive mutation operator, and reasonable termination conditions. A feature selection algorithm is then built on the designed genetic algorithm. Finally, the feature selection algorithm is validated on the corpus provided by Fudan University. The experimental results show that the proposed feature selection algorithm performs well.
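As a rough illustration of the kind of algorithm the abstract describes, the following Python sketch uses a rough-set dependency degree as the fitness signal and lets the crossover and mutation rates adapt to each individual's fitness. It is not the authors' implementation. The fitness formula, the size penalty `size_weight`, the rate formula in `adaptive_rate`, the stagnation-based stopping rule, and all function and parameter names are assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def dependency(rows, labels, subset):
    """Rough-set dependency degree: share of rows whose equivalence class
    (rows agreeing on every feature in `subset`) carries a single label."""
    classes = defaultdict(list)
    for row, label in zip(rows, labels):
        classes[tuple(row[i] for i in subset)].append(label)
    consistent = sum(len(ls) for ls in classes.values() if len(set(ls)) == 1)
    return consistent / len(rows)


def fitness(mask, rows, labels, size_weight=0.1):
    """Assumed fitness: dependency degree minus a penalty for large subsets."""
    subset = [i for i, bit in enumerate(mask) if bit]
    return dependency(rows, labels, subset) - size_weight * len(subset) / len(mask)


def adaptive_rate(f, f_max, f_avg, high):
    """Assumed adaptive scheme: below-average individuals get the full rate
    `high`; above-average ones get a rate that shrinks to 0 at the best."""
    if f < f_avg or f_max <= f_avg:
        return high
    return high * (f_max - f) / (f_max - f_avg)


def select_features(rows, labels, pop_size=20, generations=40, patience=10, seed=0):
    rng = random.Random(seed)
    n = len(rows[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best, best_fit, stale = None, float("-inf"), 0
    for _ in range(generations):
        fits = [fitness(mask, rows, labels) for mask in pop]
        f_max, f_avg = max(fits), sum(fits) / len(fits)
        if f_max > best_fit:
            best, best_fit, stale = pop[fits.index(f_max)][:], f_max, 0
        else:
            stale += 1
            if stale >= patience:  # stop when the best has not improved
                break

        def pick():
            # Binary tournament: the fitter of two random individuals wins.
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return a if fits[a] >= fits[b] else b

        nxt = [best[:]]  # elitism keeps the best mask found so far
        while len(nxt) < pop_size:
            i, j = pick(), pick()
            c1, c2 = pop[i][:], pop[j][:]
            pc = adaptive_rate(max(fits[i], fits[j]), f_max, f_avg, high=0.9)
            if n > 1 and rng.random() < pc:
                cut = rng.randrange(1, n)  # single-point crossover
                c1, c2 = c1[:cut] + c2[cut:], c2[:cut] + c1[cut:]
            for child, parent_fit in ((c1, fits[i]), (c2, fits[j])):
                pm = adaptive_rate(parent_fit, f_max, f_avg, high=0.1)
                for k in range(n):
                    if rng.random() < pm:
                        child[k] ^= 1  # bit-flip mutation
                nxt.append(child)
        pop = nxt[:pop_size]
    return [i for i, bit in enumerate(best) if bit], best_fit
```

Here each individual is a bit mask over the candidate features, `rows` holds discrete feature values per document, and `labels` holds the category of each document. Protecting fit individuals with low rates while disturbing weak ones with high rates is one common way to counter premature convergence. The elitist copy keeps the best mask from being lost between generations.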

CLC Number: TP301

HE Shao-rong, ZHU Hao-dong. Research on application and design of GA in feature selection[J]. Computer Engineering and Applications, 2010, 46(27): 131-134.
