Computer Engineering and Applications ›› 2015, Vol. 51 ›› Issue (21): 121-127.

• Database, Data Mining, and Machine Learning •

Random-sample genetic algorithm with MLP model

YOU Zhining, PU Yunming   

  1. School of Computer Engineering, Jimei University, Xiamen, Fujian 361021, China
  • Online:2015-11-01 Published:2015-11-16

Abstract: MLP training tends to become trapped in local optima, and the resulting models often generalize poorly. A genetic algorithm can escape local optima, but a large population raises computational cost. To overcome these shortcomings, the proposed SSGAMLP (Small Set Genetic Algorithm Multilayer Perceptron) model combines a genetic algorithm with the MLP model. The downward connection weights of an MLP node are viewed as a mapping from a lower layer to a higher layer, so each node (its incoming weights and threshold) can be treated as a feature expression, that is, a gene in the genetic algorithm. Training each individual MLP on a random sample subset, together with the crossover and mutation operators, introduces random factors and thus the possibility of discovering unknown feature expressions. Experiments on the MNIST dataset confirm the performance advantages of the SSGAMLP model: it reduces per-individual computational cost, improves generalization, and to a certain extent mitigates overfitting.
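The evolutionary scheme described above can be illustrated with a minimal sketch. This is not the authors' implementation: a toy 2-D dataset stands in for MNIST, the per-individual gradient training on the random subset is omitted (only the evolutionary operators are shown), and all names and hyperparameters are illustrative. Following the abstract, each gene is one hidden node, kept whole as its incoming weights plus threshold (an outgoing weight is added here so an individual is self-contained), and every generation evaluates fitness on a small random subset of the data.

```python
import math
import random

random.seed(0)

N_IN, N_HID = 2, 6            # inputs and hidden nodes per individual
POP_SIZE, GENS = 20, 30       # population size and generations
SUBSET = 40                   # random subset size used per evaluation
MUT_RATE, MUT_STD = 0.2, 0.3  # mutation probability and noise scale

# Toy 2-D data standing in for MNIST: label 1 iff x0 + x1 > 2.
data = []
for _ in range(200):
    x = (random.uniform(0, 2), random.uniform(0, 2))
    data.append((x, 1 if x[0] + x[1] > 2 else 0))

def random_gene():
    """One hidden node: incoming weights, threshold, outgoing weight."""
    return ([random.gauss(0, 1) for _ in range(N_IN)],
            random.gauss(0, 1),
            random.gauss(0, 1))

def random_individual():
    return [random_gene() for _ in range(N_HID)]

def predict(ind, x):
    out = 0.0
    for w_in, bias, w_out in ind:
        h = math.tanh(sum(w * xi for w, xi in zip(w_in, x)) + bias)
        out += w_out * h
    return 1 if out > 0 else 0

def fitness(ind, sample):
    return sum(predict(ind, x) == y for x, y in sample) / len(sample)

def crossover(a, b):
    # Node-level uniform crossover: each gene (hidden node) is inherited
    # whole from one parent, so a learned feature is never split apart.
    return [random.choice((ga, gb)) for ga, gb in zip(a, b)]

def mutate(ind):
    def jitter(v):
        return v + random.gauss(0, MUT_STD) if random.random() < MUT_RATE else v
    return [([jitter(w) for w in w_in], jitter(b), jitter(w_out))
            for w_in, b, w_out in ind]

pop = [random_individual() for _ in range(POP_SIZE)]
for _ in range(GENS):
    sample = random.sample(data, SUBSET)   # fresh random subset per generation
    pop.sort(key=lambda ind: fitness(ind, sample), reverse=True)
    elite = pop[: POP_SIZE // 2]           # truncation selection with elitism
    children = [mutate(crossover(random.choice(elite), random.choice(elite)))
                for _ in range(POP_SIZE - len(elite))]
    pop = elite + children

best_acc = max(fitness(ind, data) for ind in pop)  # final score on the full set
print(f"best accuracy on full data: {best_acc:.3f}")
```

Evaluating on a small fresh subset each generation is what keeps per-individual cost low; the noisy fitness it induces acts as an additional random factor, in the spirit of the abstract.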

Key words: Multilayer Perceptron (MLP), genetic algorithm, random subset, generalization performance