Large Weight Suppression Strategy for Training Convolutional Neural Networks

Computer Engineering and Applications, 2020, Vol. 56, Issue (2): 115-119. DOI: 10.3778/j.issn.1002-8331.1809-0175

FAN Chunlong, HE Yufeng, WANG Yixin

School of Computer, Shenyang Aerospace University, Shenyang 110136, China

Publication date: 2020-01-15  Release date: 2020-01-14

Abstract

Abstract: Convolutional neural networks (CNNs) are an important research direction in deep learning. Because CNN models are complex and difficult to train, designing better training methods for them has long been a research hotspot. Working from trained CNN models, this paper analyzes how the magnitudes of the weight parameters affect the training results. It confirms that connections with larger weights have a greater impact on model performance, and that the performance of the whole model is determined mainly by a very small number of large weights. Based on this observation, a weight-suppression training method, Weight Restrain of CNN (WR-CNN), is proposed. The method adjusts the weight update strategy used during training: a suppression coefficient related to the weight magnitude scales the weight increment computed in backpropagation, which controls the distribution of large-weight connections. Experimental results show that WR-CNN effectively improves the training performance of CNNs. Under different experimental conditions, the method reduces the error rate of the CNN model by 1.8%~5.0%, markedly reduces the model's sensitivity to large-weight connections, and improves both generalization ability and robustness. The WR-CNN training method can also be used to re-optimize an already trained CNN model.

Key words: convolutional neural network, weight parameters, suppression coefficient, generalization ability, robustness
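
The update rule described in the abstract can be illustrated with a minimal sketch. The abstract does not give the form of the suppression coefficient, so the function `1 / (1 + alpha * |w|)`, the parameter `alpha`, and the names `suppression_coefficient` and `suppressed_sgd_step` are assumptions made for illustration only; this is not the authors' WR-CNN implementation. The sketch only shows the general idea of scaling each backpropagation increment by a factor that shrinks as the weight magnitude grows.

```python
def suppression_coefficient(weight, alpha=1.0):
    """Hypothetical coefficient in (0, 1] that decreases as |weight| grows.

    The actual coefficient used by WR-CNN is not stated in the abstract;
    this form is an assumption chosen for illustration.
    """
    return 1.0 / (1.0 + alpha * abs(weight))


def suppressed_sgd_step(weights, grads, lr=0.1, alpha=1.0):
    """Apply one SGD step, scaling each increment by its suppression coefficient."""
    updated = []
    for w, g in zip(weights, grads):
        increment = -lr * g  # plain SGD increment from backpropagation
        updated.append(w + suppression_coefficient(w, alpha) * increment)
    return updated


# Three weights of different magnitude receive the same gradient.
weights = [0.0, 1.0, -3.0]
grads = [1.0, 1.0, 1.0]
new_weights = suppressed_sgd_step(weights, grads, lr=0.1, alpha=1.0)

# Size of the change applied to each weight; larger weights move less.
step_sizes = [abs(n - w) for n, w in zip(new_weights, weights)]
```

With identical gradients, the weight at 0.0 receives the full step while the weight at -3.0 receives a quarter of it, so under this assumed coefficient the large-magnitude weights change more slowly than small ones.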

FAN Chunlong, HE Yufeng, WANG Yixin. Large Weight Suppression Strategy for Training Convolutional Neural Networks[J]. Computer Engineering and Applications, 2020, 56(2): 115-119.
