Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (1): 158-164. DOI: 10.3778/j.issn.1002-8331.1906-0214

• Pattern Recognition and Artificial Intelligence •

Research on Optimization of Convolutional Neural Networks Based on TensorFlow

GUO Mingang, GONG He

  1. College of Information Technology, Jilin Agricultural University, Changchun 130118, China
    2. Jilin Province Intelligent Environmental Engineering Research Center, Jilin Agricultural University, Changchun 130118, China
    3. Jilin Province Agricultural Internet of Things Science and Technology Collaborative Innovation Center, Jilin Agricultural University, Changchun 130118, China
  • Online: 2020-01-01  Published: 2020-01-02

Optimization of Convolutional Neural Network Based on TensorFlow

GUO Mingang, GONG He   

1. College of Information Technology, Jilin Agricultural University, Changchun 130118, China
    2.Jilin Province Intelligent Environmental Engineering Research Center, Jilin Agricultural University, Changchun 130118, China
    3.Jilin Province Agricultural Internet of Things Science and Technology Collaborative Innovation Center, Jilin Agricultural University, Changchun 130118, China
  • Online: 2020-01-01  Published: 2020-01-02

Abstract: To address the shortcomings of convolutional neural networks in performance-to-power ratio, a heterogeneous CPU+GPU collaborative computing model is proposed. During model computation, the CPU handles logic-intensive processing and serial computation, while the GPU executes highly threaded parallel processing tasks. Experimental comparisons with single-GPU training and single-CPU training show that the heterogeneous CPU+GPU computing model achieves a better performance-to-power ratio. Furthermore, in convolutional neural networks the Swish activation function involves many parameters when the error gradient is computed during back propagation, which leads to heavy computation and slow convergence, while the ReLU activation function has a zero derivative on the negative x interval, so negative gradients are set to zero and neurons may never be activated. To solve these problems, a new activation function, ReLU-Swish, is proposed: the part of the Swish activation function below zero and the part of the ReLU activation function above zero are combined into a piecewise function. Comparative experiments on the CIFAR-10 and MNIST data sets show that the ReLU-Swish activation function clearly improves convergence speed and model test accuracy compared with the Swish and ReLU activation functions.
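As an illustration of the CPU+GPU division of labour described above, the following is a minimal sketch assuming TensorFlow 2.x and a single GPU visible as "/GPU:0"; the toy network and the random stand-in batch are illustrative only and are not the paper's exact configuration. Input construction is pinned to the CPU, while the convolutional forward pass and gradient computation are pinned to the GPU.

    import tensorflow as tf

    tf.config.set_soft_device_placement(True)  # fall back gracefully if no GPU is present

    # Serial, logic-oriented work pinned to the CPU: building an input batch.
    with tf.device("/CPU:0"):
        images = tf.random.uniform([64, 32, 32, 3])               # stand-in for a CIFAR-10 batch
        labels = tf.random.uniform([64], maxval=10, dtype=tf.int32)

    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10),
    ])

    # Highly threaded, parallel work pinned to the GPU: forward pass, loss, gradients.
    with tf.device("/GPU:0"):
        with tf.GradientTape() as tape:
            logits = model(images, training=True)
            loss = tf.reduce_mean(
                tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))
        grads = tape.gradient(loss, model.trainable_variables)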

Keywords: TensorFlow, CPU+GPU, convolutional neural network, Swish activation function, ReLU activation function, ReLU-Swish activation function

Abstract: Aiming at the deficiency of the convolutional neural network in the performance-to-power ratio, a collaborative computing model of heterogeneous CPU+GPU is proposed. In the process of model calculation, the CPU is responsible for logic-intensive processing and serial computing, while the GPU executes highly threaded parallel processing tasks. Experimental tests compared with single-GPU training and single-CPU training show that the heterogeneous CPU+GPU computing model achieves a better performance-to-power ratio. Moreover, in the convolutional neural network the Swish activation function involves many parameters when the error gradient is computed during back propagation, which results in heavy computation and slow convergence, while the ReLU activation function has a zero derivative in the negative x interval, so the negative gradient is set to zero and neurons may never be activated. To solve these problems, a new activation function, ReLU-Swish, is proposed: the part of the Swish activation function below zero and the part of the ReLU activation function above zero are combined into a piecewise function. Comparative experiments are carried out on the two data sets CIFAR-10 and MNIST. The experimental results show that the ReLU-Swish activation function achieves a clear improvement in convergence speed and in model test accuracy compared with the Swish activation function and the ReLU activation function.
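For illustration, a minimal sketch of the ReLU-Swish piecewise activation described above, assuming the Swish branch uses x·sigmoid(βx) with β = 1 (an assumed default, not a value taken from the paper):

    import tensorflow as tf

    def relu_swish(x, beta=1.0):
        # Negative branch of Swish (x * sigmoid(beta * x) for x < 0) joined with
        # the positive branch of ReLU (x itself for x >= 0).
        # beta = 1.0 is an assumed default, not taken from the paper.
        return tf.where(x < 0.0, x * tf.sigmoid(beta * x), x)

    # Usage: pass it like any other activation, e.g.
    # tf.keras.layers.Conv2D(32, 3, activation=relu_swish)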

Key words: TensorFlow, CPU+GPU, convolutional neural network, Swish activation function, ReLU activation function, ReLU-Swish activation function