Computer Engineering and Applications, 2020, Vol. 56, Issue (2): 115-119. DOI: 10.3778/j.issn.1002-8331.1809-0175


Large Weight Suppression Strategy for Training Convolutional Neural Networks

FAN Chunlong, HE Yufeng, WANG Yixin   

  1. School of Computer Science, Shenyang Aerospace University, Shenyang 110136, China
  • Online: 2020-01-15    Published: 2020-01-14

Abstract: Convolutional Neural Network (CNN) is an important research direction in deep learning. Because CNN models are complex and difficult to train, designing better training methods for them has long been a research hotspot. Based on trained CNN models, this paper analyzes the influence of connection weights on training results and confirms that connections with larger weights have a greater impact on network performance; overall model performance depends mainly on a small number of large weights. Accordingly, a training method named Weight Restrain of CNN (WR-CNN) is proposed. It adjusts the weight-update strategy during training by introducing a suppression coefficient related to the magnitude of each weight, and uses this coefficient to scale the weight increment computed in back propagation, thereby controlling the distribution of large-weight connections. Experimental results show that WR-CNN effectively improves CNN training: under different experimental conditions it reduces the error rate of the CNN model by 1.8%~5.0%, markedly lowers the model's sensitivity to large-weight connections, and improves generalization ability and robustness. The WR-CNN training method can also be used to re-optimize an already trained CNN model.
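The abstract does not give the exact form of the suppression coefficient, so the following is only a minimal sketch in PyTorch of the general idea: scale each back-propagated weight increment by a coefficient that decreases as the weight's magnitude grows. The function name suppressed_step, the hyperparameter alpha, and the coefficient 1/(1 + alpha*|w|) are illustrative assumptions, not the authors' formulation.

    import torch

    def suppressed_step(model, lr=0.01, alpha=1.0):
        # One SGD-like update in which every weight increment is scaled by a
        # hypothetical suppression coefficient 1 / (1 + alpha * |w|), so that
        # connections with larger weights receive smaller updates.
        with torch.no_grad():
            for p in model.parameters():
                if p.grad is None:
                    continue
                coeff = 1.0 / (1.0 + alpha * p.abs())  # larger |w| -> smaller increment
                p -= lr * coeff * p.grad

In a plain SGD training loop, suppressed_step(model) would be called after loss.backward() in place of optimizer.step().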

Key words: convolutional neural network, weight parameters, suppression coefficient, generalization ability, robustness
