Neural Network Compression Algorithm Based on Adversarial Learning and Knowledge Distillation
LIU Jinjin, LI Qingbao, LI Xiaonan
1. Information Engineering University, Zhengzhou 450003, China
2. State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450003, China
3. School of Computer Science, Zhongyuan University of Technology, Zhengzhou 450007, China
LIU Jinjin, LI Qingbao, LI Xiaonan. Neural Network Compression Algorithm Based on Adversarial Learning and Knowledge Distillation[J]. Computer Engineering and Applications, 2021, 57(21): 180-187.