Computer Engineering and Applications, 2020, Vol. 56, Issue (2): 115-119. DOI: 10.3778/j.issn.1002-8331.1809-0175


Large Weight Suppression Strategy for Training Convolutional Neural Networks

FAN Chunlong, HE Yufeng, WANG Yixin   

  1. School of Computer Science, Shenyang Aerospace University, Shenyang 110136, China
  • Online: 2020-01-15    Published: 2020-01-14

Abstract: Convolutional Neural Network (CNN) is an important research direction in deep learning. Because CNN models are complex and difficult to train, designing better training methods for them has long been a research hotspot. Based on trained CNN models, this paper analyzes the influence of connection weights on training results and confirms that connections with larger weights have a greater impact on network performance; overall model performance depends mainly on a small number of large weights. Accordingly, a training method named Weight Restrain of CNN (WR-CNN) is proposed. It adjusts the weight-update strategy during training by introducing a suppression coefficient related to the magnitude of each weight, and uses this coefficient to scale the weight increment computed in back propagation, thereby controlling the distribution of large-weight connections. Experimental results show that WR-CNN effectively improves CNN training: under different experimental conditions it reduces the error rate of the CNN model by 1.8%~5.0%, markedly lowers the model's sensitivity to large-weight connections, and improves generalization ability and robustness. The WR-CNN training method can also be used to re-optimize an already trained CNN model.
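The abstract does not give the exact form of the suppression coefficient, so the following is only a minimal sketch in PyTorch of the general idea: scale each back-propagated weight increment by a coefficient that decreases as the weight's magnitude grows. The function name suppressed_step, the hyperparameter alpha, and the coefficient 1/(1 + alpha*|w|) are illustrative assumptions, not the authors' formulation.

    import torch

    def suppressed_step(model, lr=0.01, alpha=1.0):
        # One SGD-like update in which every weight increment is scaled by a
        # hypothetical suppression coefficient 1 / (1 + alpha * |w|), so that
        # connections with larger weights receive smaller updates.
        with torch.no_grad():
            for p in model.parameters():
                if p.grad is None:
                    continue
                coeff = 1.0 / (1.0 + alpha * p.abs())  # larger |w| -> smaller increment
                p -= lr * coeff * p.grad

In a plain SGD training loop, suppressed_step(model) would be called after loss.backward() in place of optimizer.step().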

Key words: convolutional neural network, weight parameters, suppression coefficient, generalization ability, robustness
