Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (8): 138-147. DOI: 10.3778/j.issn.1002-8331.2112-0091

• Pattern Recognition and Artificial Intelligence •

Sparse Binary Programming Method for Pruning of Randomly Initialized Neural Networks

LU Lin, JI Fanfan, YUAN Xiaotong   

  1. School of Automation, Nanjing University of Information Science and Technology, Nanjing 210044, China
    2. Jiangsu Key Laboratory of Big Data Analysis Technology, Nanjing 210044, China
    3. School of Computer, Nanjing University of Information Science and Technology, Nanjing 210044, China
  • Online: 2023-04-15  Published: 2023-04-15

Abstract: Classical pruning algorithms for deep neural networks typically pre-train the model before pruning and fine-tune the sparse network afterwards. Inspired by the recent remarkable success of pruning methods that operate on randomly initialized networks, such as edge-popup, this paper proposes a sparse binary programming method for pruning randomly initialized networks. The method models the pruning process as a sparse binary constrained optimization problem. Its core idea is to use sparse binary programming to learn a binary mask with which an untrained yet well-performing sparse subnetwork can be extracted from a randomly initialized neural network. Compared with previous pruning algorithms based on randomly initialized networks, the sparse networks found by this method achieve better classification generalization across multiple sparsity levels. Compared with the edge-popup algorithm, the model improves accuracy by 7.98 percentage points at 70% sparsity on the ImageNet classification task, and by 2.48 percentage points at 50% sparsity on the CIFAR-10 classification task.
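To make the formulation concrete, the sparse binary constrained problem described in the abstract can plausibly be stated as follows (a reconstruction from the abstract's wording, not the paper's exact notation):

    min_m  L(f(x; w_0 ⊙ m), y)   s.t.   m ∈ {0,1}^d,  ||m||_0 ≤ (1 − s)·d

where w_0 is the fixed random initialization, ⊙ denotes element-wise multiplication, and s is the target sparsity. The sketch below illustrates one standard way to train such a mask over frozen random weights: score-based top-k masking with a straight-through gradient, in the spirit of the edge-popup baseline the abstract cites. It is not the paper's sparse binary programming solver, and the class and parameter names (MaskedLinear, sparsity) are illustrative.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MaskedLinear(nn.Module):
        """Frozen random weights plus a learnable binary mask (hypothetical sketch)."""

        def __init__(self, in_features, out_features, sparsity=0.5):
            super().__init__()
            # Fixed random initialization w_0; never updated during training.
            self.weight = nn.Parameter(
                torch.randn(out_features, in_features) / in_features ** 0.5,
                requires_grad=False)
            # Real-valued scores from which the binary mask m is derived.
            self.scores = nn.Parameter(torch.randn(out_features, in_features))
            self.sparsity = sparsity  # fraction of weights pruned away

        def forward(self, x):
            # Enforce ||m||_0 <= k by keeping only the top-k scores.
            k = max(1, int(self.scores.numel() * (1.0 - self.sparsity)))
            threshold = torch.topk(self.scores.flatten(), k).values.min()
            mask = (self.scores >= threshold).float()
            # Straight-through estimator: binary mask in the forward pass,
            # identity gradient to the scores in the backward pass.
            mask = mask + self.scores - self.scores.detach()
            return F.linear(x, self.weight * mask)

Only the scores receive gradients, e.g. torch.optim.SGD([layer.scores], lr=0.1); the frozen weight tensor plays the role of w_0 above, so the subnetwork selected by the mask is never trained, matching the "untrained but well-performing" setting the abstract describes.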

Key words: neural network pruning, random initialization, binary mask, binary programming, sparse optimization