Computer Engineering and Applications (计算机工程与应用) ›› 2021, Vol. 57 ›› Issue (22): 86-91. DOI: 10.3778/j.issn.1002-8331.2011-0239

• Theory, Research and Development •

Neural Network Optimization Method Based on Weight Rollback of Invalid Convolution Kernels

顾上航,张利军,郭越超,徐勇   

  1. School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518000
  • Publication date: 2021-11-15 Release date: 2021-11-16

Neural Network Optimization Method Based on Invalid Filters Weight Regression

GU Shanghang, ZHANG Lijun, GUO Yuechao, XU Yong   

  1. School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong 518000, China
  • Online: 2021-11-15 Published: 2021-11-16

Abstract:

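During neural network training, some convolution kernels degenerate into invalid kernels that contribute nothing at inference time. To address this problem, a method is proposed that uses only a single model to activate invalid kernels during training and thereby improve model performance. First, the initial model is trained to convergence. Next, the effectiveness of each kernel is measured in two ways: by the L1 norm and by kernel correlation. Finally, the weights of the invalid kernels are rolled back to their values from the early stage of training, and the model is retrained. Experiments on image classification datasets such as CIFAR-10 and CIFAR-100 show that, on both residual networks and lightweight networks, the proposed method effectively restores invalid kernels and improves model accuracy. Compared with previous methods, it achieves the best results at low cost, raising accuracy on image classification tasks by 0.93% on average.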
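As a rough illustration of the steps described above, the following Python sketch scores filters by L1 norm and rolls the weakest ones back to an early checkpoint. It is a minimal sketch under stated assumptions, not the authors' implementation: filters are flattened into plain lists so that no external library is needed, and the names `find_invalid_filters`, `rewind_filters`, and the `ratio` cut-off are hypothetical, since the abstract does not say how the threshold is chosen. The retraining step is left out.

```python
from typing import List

Filter = List[float]  # one convolution filter, flattened to a 1-D weight list


def l1_norm(filt: Filter) -> float:
    """Sum of absolute weights; a small value suggests a weak filter."""
    return sum(abs(w) for w in filt)


def find_invalid_filters(layer: List[Filter], ratio: float) -> List[int]:
    """Return indices of the `ratio` fraction of filters with the smallest L1 norm.

    `ratio` is a hypothetical hyperparameter; the abstract does not
    specify how the cut-off between valid and invalid filters is set.
    """
    count = int(len(layer) * ratio)
    ranked = sorted(range(len(layer)), key=lambda i: l1_norm(layer[i]))
    return sorted(ranked[:count])


def rewind_filters(converged: List[Filter], early: List[Filter],
                   invalid: List[int]) -> List[Filter]:
    """Copy early-training weights into the invalid slots; keep the rest."""
    invalid_set = set(invalid)
    return [list(early[i]) if i in invalid_set else list(converged[i])
            for i in range(len(converged))]


# Toy layer with four flattened filters (illustrative values, not real data).
early_checkpoint = [[0.5, -0.4], [0.3, 0.6], [-0.7, 0.2], [0.1, -0.9]]
converged_layer = [[1.2, -0.8], [0.01, 0.02], [-0.9, 1.1], [0.0, -0.03]]

invalid_idx = find_invalid_filters(converged_layer, ratio=0.5)
restart_layer = rewind_filters(converged_layer, early_checkpoint, invalid_idx)
print(invalid_idx)  # [1, 3]
print(restart_layer)
```

In a real training pipeline, `restart_layer` would serve as the starting point for the retraining phase, with the early checkpoint saved during the first few epochs of the initial run.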

Key words: convolutional neural network (CNN), image classification, convolution kernel replacement, convolution kernel effectiveness

Abstract:

During neural network training, some filters degenerate into invalid filters and contribute nothing at inference time. To solve this problem, a new learning paradigm, training anneal, is proposed that uses only the model itself to re-activate invalid filters efficiently and improve model capability. First, a Convolutional Neural Network (CNN) is initialized and trained to convergence with standard methods. Then, the importance of each convolution filter is measured in two ways: by its L1-norm and by filter correlation. Finally, the weights of the invalid filters are rewound to their values from earlier in training, and the whole network is re-trained. Extensive experiments on classification tasks using the CIFAR-10 and CIFAR-100 datasets demonstrate the effectiveness of this learning paradigm. Training anneal is applicable to both residual and lightweight networks, and it re-activates invalid filters effectively. Compared with previous methods for improving CNN models, training anneal achieves the best results at low cost, improving image classification accuracy by 0.93% on average.
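One way the filter-correlation measure might work is sketched below. The abstract does not define the correlation, so this example assumes the Pearson correlation between flattened filters and treats a filter as a candidate for rewinding when it nearly duplicates an earlier filter that has been kept in the same layer. The names `pearson`, `redundant_filters`, and `threshold` are hypothetical, and the toy weights are illustrative only.

```python
import math
from typing import List

Filter = List[float]  # one convolution filter, flattened to a 1-D weight list


def pearson(a: Filter, b: Filter) -> float:
    """Pearson correlation of two equal-length weight lists (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def redundant_filters(layer: List[Filter], threshold: float) -> List[int]:
    """Flag filters that correlate strongly with an earlier, kept filter.

    Assumption: only the later filter of a correlated pair is flagged,
    so one representative of each group of similar filters survives.
    """
    redundant: List[int] = []
    for i in range(len(layer)):
        for j in range(i):
            if j not in redundant and pearson(layer[i], layer[j]) >= threshold:
                redundant.append(i)
                break
    return redundant


# Toy layer: filters 1 and 3 are near copies (up to scale) of filter 0.
layer = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.1],
    [3.0, -1.0, 0.5],
    [1.1, 2.0, 2.9],
]
print(redundant_filters(layer, threshold=0.95))  # [1, 3]
```

Because Pearson correlation ignores scale and offset, this criterion flags filters that respond to the same pattern even when their L1-norms differ, which is why it can complement a magnitude-based score.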

Key words: Convolutional Neural Network (CNN), image classification, filter replacement, filter efficiency