Neural Network Optimization Method Based on Invalid Filters Weight Regression

doi:10.3778/j.issn.1002-8331.2011-0239

Abstract

During neural network training, some filters degenerate into invalid filters that contribute nothing at inference time. To address this problem, a new learning paradigm, training anneal, is proposed: it uses only the model itself to re-activate invalid filters efficiently and improve model capability. First, a Convolutional Neural Network (CNN) is initialized and trained to convergence with standard methods. Then, the importance of each convolution filter is measured in two ways: by its L1-norm and by filter correlation. Finally, the weights of invalid filters are rewound to their values from an early stage of training, and the whole network is retrained. Extensive experiments on classification tasks using the CIFAR-10 and CIFAR-100 datasets demonstrate the effectiveness of this paradigm. Training anneal is applicable to both residual and lightweight networks, and invalid filters are re-activated effectively. Compared with previous methods for improving CNN models, training anneal achieves the best results at low cost, improving image classification accuracy by 0.93% on average.

Key words: Convolutional Neural Network (CNN), image classification, filter replacement, filter efficiency
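
The measure-and-rewind steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not say how invalid filters are selected or how filter correlation is defined, so the function names, the `ratio` parameter, the use of cosine similarity as the correlation score, and the rule "rewind the lowest-L1 fraction of filters" are all assumptions made for illustration. Retraining after the rewind is not shown.

```python
import numpy as np

def l1_importance(weights):
    """Per-filter L1 norm for conv weights shaped (out_ch, in_ch, kh, kw)."""
    return np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)

def max_abs_cosine(weights):
    """For each filter, the largest |cosine similarity| to any other filter
    in the same layer (an assumed stand-in for 'filter correlation')."""
    flat = weights.reshape(weights.shape[0], -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-12)  # guard against all-zero filters
    sim = np.abs(unit @ unit.T)
    np.fill_diagonal(sim, 0.0)              # ignore self-similarity
    return sim.max(axis=1)

def rewind_invalid_filters(trained, early, ratio=0.25):
    """Copy `trained`, replacing the lowest-L1 fraction of filters with the
    values saved in `early` (a checkpoint from early in training).
    The selection rule and `ratio` are hypothetical."""
    k = int(round(ratio * trained.shape[0]))
    invalid = np.argsort(l1_importance(trained))[:k]
    restored = trained.copy()
    restored[invalid] = early[invalid]
    return restored, invalid

# Toy layer with 4 filters; filter 1 is scaled down to mimic an invalid filter.
rng = np.random.default_rng(0)
early = rng.normal(size=(4, 2, 3, 3))
trained = rng.normal(size=(4, 2, 3, 3))
trained[1] *= 1e-4
restored, invalid = rewind_invalid_filters(trained, early, ratio=0.25)
```

In this toy example only the decayed filter is selected and rewound; the other filters keep their trained values. A correlation-based variant could instead select filters whose `max_abs_cosine` score is high, treating them as redundant.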

摘要：

在神经网络模型训练过程中，存在部分卷积核退化为无效卷积核，在神经网络推理过程失去作用的问题。针对该问题，提出了一种仅使用单个模型就能在训练过程中激活无效卷积核，提高模型性能的方法。首先将初始模型训练至收敛时刻；然后通过L1正则和卷积核相关性两种方式衡量卷积核的有效性；最后将无效卷积核的权值回退到模型训练的初期阶段并对模型进行重训练。在CIFAR-10、CIFAR-100等图像分类的数据集上的实验结果表明，无论是在残差网络还是在轻量级网络上，提出的方法都能有效地恢复无效卷积核，提高神经网络模型精度。相比之前的方法，该方法在低代价下达到了最佳效果，在图像分类任务上平均提高了0.93%的准确率。

关键词: 卷积神经网络（CNN）, 图像分类, 卷积核替换, 卷积核有效性

GU Shanghang, ZHANG Lijun, GUO Yuechao, XU Yong. Neural Network Optimization Method Based on Invalid Filters Weight Regression[J]. Computer Engineering and Applications, 2021, 57(22): 86-91.

顾上航，张利军，郭越超，徐勇. 基于无效卷积核权值回退的神经网络优化方法[J]. 计算机工程与应用, 2021, 57(22): 86-91.
