Computer Engineering and Applications ›› 2015, Vol. 51 ›› Issue (12): 21-26.


Incremental updating method for big data feature learning

BU Fanyu1,2, CHEN Zhikui1, ZHANG Qingchen1   

  1. School of Software Technology, Dalian University of Technology, Dalian, Liaoning 116620, China
  2. College of Vocation, Inner Mongolia University of Finance and Economics, Hohhot 010010, China
  • Online: 2015-06-15 Published: 2015-06-30

A big data feature learning model supporting incremental updating

卜范玉1,2,陈志奎1,张清辰1   

  1. 大连理工大学 软件学院,辽宁 大连 116620
  2. 内蒙古财经大学 职业学院,呼和浩特 010010

Abstract: In the era of big data, data are generated at extremely high speed, and their contents and features change dynamically. A learning algorithm for neural networks should therefore be able to adapt to new instances while preserving prior knowledge. However, a feed-forward neural network trained by the typical Back-Propagation (BP) algorithm is not incremental in nature. This paper proposes an incremental back-propagation model for training neural networks. Incremental learning is achieved by adjusting both the parameters and the structure of the feed-forward neural network. The parameters are adapted incrementally by optimizing an objective function. The network topology is adapted by increasing the number of hidden neurons, but only if the parameter adaptation severely perturbs the prior knowledge. After the model is updated, the Singular Value Decomposition (SVD) of the weight matrix is computed to remove the redundant connections of each newly added hidden unit. Experimental results demonstrate that the proposed model can adjust its parameters and structure in real time as required by big data processing, while preserving as much prior knowledge as possible in evolving environments.
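The structural update mentioned in the abstract, adding a hidden neuron when parameter adaptation disturbs prior knowledge too much, can be illustrated with a minimal sketch. Everything below is an assumption for illustration and not the authors' published procedure: the one-hidden-layer tanh network, the `drift` measure (mean squared change in outputs on previously seen inputs), and the choice to give the new neuron zero outgoing weights so that existing outputs are unchanged at the moment of growth.

```python
import numpy as np

def forward(params, X):
    """One-hidden-layer network; X has shape (n_samples, n_inputs)."""
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2

def drift(old_params, new_params, X_old):
    """Hypothetical measure of lost prior knowledge:
    mean squared change in outputs on previously seen inputs."""
    diff = forward(new_params, X_old) - forward(old_params, X_old)
    return float(np.mean(diff ** 2))

def add_hidden_unit(params, rng, scale=0.1):
    """Append one hidden neuron (illustrative choice, not from the paper).

    Incoming weights are small random values; outgoing weights are zero,
    so the network computes the same function until training resumes.
    """
    W1, b1, W2, b2 = params
    W1 = np.hstack([W1, scale * rng.standard_normal((W1.shape[0], 1))])
    b1 = np.append(b1, 0.0)
    W2 = np.vstack([W2, np.zeros((1, W2.shape[1]))])
    return W1, b1, W2, b2

rng = np.random.default_rng(0)
params = (rng.standard_normal((3, 4)), np.zeros(4),
          rng.standard_normal((4, 2)), np.zeros(2))
X_old = rng.standard_normal((5, 3))
grown = add_hidden_unit(params, rng)
print(grown[0].shape, grown[2].shape, drift(params, grown, X_old))
```

In a training loop under these assumptions, one would compute `drift` after a parameter update and call `add_hidden_unit` only when it exceeds a chosen threshold; the threshold itself is not specified in the abstract.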

Key words: big data, feed-forward neural networks, incremental learning, Singular Value Decomposition (SVD)

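Abstract (Chinese version): Big data change at high speed, and both their content and their distribution characteristics are in constant flux. The current feed-forward neural network model is a static learning model that does not support incremental updating, so it struggles to learn the features of dynamically changing big data in real time. To address this problem, a big data feature learning model that supports incremental updating is proposed. The parameters are updated quickly and incrementally by means of a designed optimization objective function; to preserve the network's original knowledge during the update, a squared error function is minimized. For data whose features change frequently, the network structure is updated by increasing the number of hidden-layer neurons, so that the updated network can learn the features of dynamically changing big data in real time. After the parameters and structure have been updated, the updated network structure is optimized through SVD of the weight matrix, which removes redundant network connections and improves the generalization ability of the network model. Experimental results show that the proposed model can learn the features of dynamic big data in real time by continuously updating the parameters and structure of the neural network, while preserving the original knowledge of the network model as much as possible.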
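The SVD-based optimization step can be sketched as a low-rank approximation of a weight matrix. This is only an illustration under stated assumptions: the abstract does not say how redundant connections are identified, so the `energy` threshold and the function name `truncate_weights_svd` are hypothetical, and rank truncation is used here as a common stand-in rather than the authors' criterion.

```python
import numpy as np

def truncate_weights_svd(W, energy=0.95):
    """Return a low-rank approximation of W and the rank kept.

    Keeps the smallest number of singular values whose squared sum
    reaches the fraction `energy` of the total (assumed criterion).
    Assumes W is not the zero matrix.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = min(int(np.searchsorted(cumulative, energy)) + 1, len(s))
    W_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return W_k, k

# Usage: a 6x5 weight matrix that is rank 2 by construction.
rng = np.random.default_rng(0)
W = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))
W_k, k = truncate_weights_svd(W, energy=0.999999)
print(k, np.linalg.norm(W - W_k) / np.linalg.norm(W))
```

The relative Frobenius error of the approximation is bounded by the square root of `1 - energy`, which gives a direct way to trade compactness against fidelity to the trained weights.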

Key words (Chinese version): big data, feed-forward neural networks, incremental learning, Singular Value Decomposition (SVD)