Object Detection Network Compression Method Based on Pruning and Quantization

doi:10.3778/j.issn.1002-8331.2105-0134

Abstract

Abstract: Object detection networks suffer from redundant parameters, high model complexity, and slow inference, which makes them difficult to deploy on resource-constrained embedded devices. To address these problems, an object detection network compression method combining pruning and quantization is proposed. First, the object detection network is sparsely trained to obtain the scaling factors. The proportion of channel importance in each convolutional layer is calculated from the distribution of the scaling factors, and a dynamic threshold computed from the scaling factors is used to prune the convolutional layers that contribute little to the network model. Then, the 32-bit floating-point data are quantized into 8-bit integer data by uniform mapping, which reduces both the amount of network computation and the size of the network model. Finally, the compression method is verified with YOLO series object detection networks on the Pedestrian and Vehicle dataset, the Hands dataset, and the VOC2012 dataset. The experimental results show that, after dynamic threshold pruning and uniform mapping quantization, the network model is compressed from 234 MB to less than 10 MB at an accuracy loss of 4%, and the detection speed increases fivefold, which effectively solves the problem of difficult deployment and application.

Key words: object detection network, model compression, dynamic threshold pruning, scaling factor, uniform mapping quantization

摘要： 针对目标检测网络参数量冗余、模型复杂、推理速度缓慢以及难以部署在资源受限的嵌入式设备等问题，提出一种融合剪枝与量化的目标检测网络压缩方法。首先对目标检测网络模型进行稀疏化训练得到缩放因子，并根据缩放因子的分布计算卷积层中通道重要性的占比，根据缩放因子计算动态阈值将对网络模型贡献小的卷积层剪除。然后通过均匀映射的方式将32位浮点型数据量化成8位整型数据，减少网络计算量的同时压缩网络模型的大小。最后采用YOLO系列目标检测网络对行人与车辆数据集、Hands数据集和VOC2012数据集进行压缩方法验证。实验表明，目标检测网络经过动态阈值剪枝和均匀映射量化后在精度损失4%的前提下，将网络模型从234 MB压缩至10 MB以内，检测速度提升5倍，有效解决了部署应用难的问题。

关键词: 目标检测网络, 模型压缩, 动态阈值剪枝, 缩放因子, 均匀映射量化
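
The two compression steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the dynamic threshold formula, so `dynamic_threshold_mask` and its `drop_share` parameter are hypothetical and simply assume that the pruned channels of a layer may together hold at most a fixed share of that layer's total scaling-factor magnitude. `quantize_uniform` assumes an affine (asymmetric) uniform mapping onto the signed 8-bit range; the paper's mapping may differ.

```python
from typing import List, Tuple


def dynamic_threshold_mask(gammas: List[float], drop_share: float = 0.1) -> List[bool]:
    """Return a keep-mask (True = keep) for the channels of one layer.

    ASSUMPTION: the per-layer threshold is set so that the pruned channels
    together hold at most `drop_share` of the layer's total |gamma|.
    This rule is illustrative only, not the formula used in the paper.
    """
    mags = [abs(g) for g in gammas]
    total = sum(mags)
    keep = [True] * len(gammas)
    if total == 0.0:
        return keep  # nothing to rank, keep every channel
    budget = drop_share * total
    running = 0.0
    # Visit channels from least to most important and prune within budget.
    for idx in sorted(range(len(mags)), key=lambda i: mags[i]):
        if running + mags[idx] > budget:
            break
        running += mags[idx]
        keep[idx] = False
    return keep


def quantize_uniform(values: List[float]) -> Tuple[List[int], float, int]:
    """Map floats to int8 codes with a uniform affine mapping (assumed form)."""
    lo = min(min(values), 0.0)  # the range must contain 0 so that
    hi = max(max(values), 0.0)  # zero is represented exactly
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # all-zero input: any positive step size works
    zero_point = round(-128 - lo / scale)
    codes = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point


def dequantize(codes: List[int], scale: float, zero_point: int) -> List[float]:
    """Recover approximate float values from int8 codes."""
    return [(c - zero_point) * scale for c in codes]


if __name__ == "__main__":
    print(dynamic_threshold_mask([0.9, 0.02, 0.8, 0.01, 0.7]))
    codes, scale, zp = quantize_uniform([-0.5, 0.0, 0.25, 1.5])
    print(codes, scale, zp)
    print(dequantize(codes, scale, zp))
```

Storing 8-bit integers instead of 32-bit floats reduces parameter storage by roughly a factor of 4 on its own, so most of the reported reduction from 234 MB to less than 10 MB presumably comes from the pruning step.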

YANG Guowei, XU Zhiwang, FANG Chen, WANG Yizhong. Object Detection Network Compression Method Based on Pruning and Quantization[J]. Computer Engineering and Applications, 2022, 58(22): 108-115.

杨国威, 许志旺, 房臣, 王以忠. 融合剪枝与量化的目标检测网络压缩方法[J]. 计算机工程与应用, 2022, 58(22): 108-115.
