Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (16): 129-131.
• Database, Signal and Information Processing •
FENG Nan1,FANG De-ying2,XIE Jing1
冯 楠1,方德英2,解 晶1
Abstract: This paper presents a method, based on Genetic Algorithms (GA), for splitting a sample set into a training set and a test set. The method is applied to the data-splitting step of Data Mining (DM). Splitting the data with this method maximizes the accuracy of the classification model while minimizing the noise percentage between the training set and the test set. The effectiveness of the method is demonstrated on a set of software project sample data.
Key words: Genetic Algorithms (GA), data splitting, Data Mining (DM)
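Abstract (translated from the Chinese): A genetic-algorithm-based method for splitting a sample set is proposed. In the data mining process, the method addresses the problem of how to split a sample set so as to obtain the best training set and test set. Splitting the data with this method not only improves the classification accuracy of the classification model but also minimizes the noise percentage between the training set and the test set. Finally, a set of software project sample data is used as an example to illustrate the effectiveness of the method.
Key words (translated from the Chinese): genetic algorithm, data splitting, data mining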
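The abstract does not give the chromosome encoding, the fitness function, the classifier, or the definition of the noise percentage, so the following is only a minimal sketch of the general idea of GA-driven train/test splitting, not the authors' method. It assumes a binary mask per sample (True = training set), a nearest-centroid classifier, and a fitness equal to test-set accuracy. The noise term is left out because the abstract does not define it. All function names and parameters are hypothetical.

```python
import random

def nearest_centroid_accuracy(X, y, train_idx, test_idx):
    """Fit per-class centroids on the training rows; return accuracy on the test rows."""
    sums, counts = {}, {}
    for i in train_idx:
        acc = sums.setdefault(y[i], [0.0] * len(X[i]))
        for d, v in enumerate(X[i]):
            acc[d] += v
        counts[y[i]] = counts.get(y[i], 0) + 1
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
    correct = 0
    for i in test_idx:
        pred = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(X[i], centroids[c])),
        )
        correct += pred == y[i]
    return correct / len(test_idx)

def fitness(mask, X, y, min_frac=0.2):
    """Assumed fitness: test accuracy; splits with a too-small side score 0."""
    train = [i for i, m in enumerate(mask) if m]
    test = [i for i, m in enumerate(mask) if not m]
    n = len(mask)
    if len(train) < min_frac * n or len(test) < min_frac * n:
        return 0.0
    return nearest_centroid_accuracy(X, y, train, test)

def ga_split(X, y, pop_size=20, generations=30, mutation_rate=0.05, seed=0):
    """Evolve a boolean mask over the samples (True = training, False = test)."""
    rng = random.Random(seed)
    n = len(X)
    # Initial population: random masks with roughly 70% of samples in training.
    pop = [[rng.random() < 0.7 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(m, X, y), reverse=True)
        elite = ranked[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation moves a sample between the two sets.
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = elite + children
    best = max(pop, key=lambda m: fitness(m, X, y))
    return best, fitness(best, X, y)
```

Calling `ga_split(X, y)` on a list of feature vectors `X` and labels `y` returns the best mask found and its fitness. Because the fitness is measured on the very test set the search selects, the reported accuracy is optimistically biased; a practical version would add a second objective (such as the paper's noise measure) or a separate hold-out set.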
FENG Nan, FANG De-ying, XIE Jing. Method of data splitting for sample set based on genetic algorithms[J]. Computer Engineering and Applications, 2008, 44(16): 129-131.
冯 楠1,方德英2,解 晶1. 一种基于遗传算法的样本集数据分割方法[J]. 计算机工程与应用, 2008, 44(16): 129-131.
URL: http://cea.ceaj.org/EN/
http://cea.ceaj.org/EN/Y2008/V44/I16/129