Sparse Binary Programming Method for Pruning of Randomly Initialized Neural Networks
LU Lin, JI Fanfan, YUAN Xiaotong
1. School of Automation, Nanjing University of Information Science and Technology, Nanjing 210044, China
2. Jiangsu Key Laboratory of Big Data Analysis Technology, Nanjing 210044, China
3. School of Computer, Nanjing University of Information Science and Technology, Nanjing 210044, China
LU Lin, JI Fanfan, YUAN Xiaotong. Sparse Binary Programming Method for Pruning of Randomly Initialized Neural Networks[J]. Computer Engineering and Applications, 2023, 59(8): 138-147.