Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (6): 210-219.DOI: 10.3778/j.issn.1002-8331.2311-0002

• Pattern Recognition and Artificial Intelligence •

Fusion of Deep Reinforcement Learning in a Joint Compression Method for Convolutional Neural Networks

MA Zuxin, CUI Yunhe, QIN Yongbin, SHEN Guowei, GUO Chun, CHEN Yi, QIAN Qing   

  1. Engineering Research Center of Ministry of Education for Text Computing and Cognitive Intelligence, School of Computer Science and Technology, Guizhou University, Guiyang 550025, China
    2. State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
    3. Provincial Key Laboratory of Software Engineering and Information Security, Guizhou University, Guiyang 550025, China
    4. School of Information, Guizhou University of Finance and Economics, Guiyang 550025, China
  • Online: 2025-03-15  Published: 2025-03-14

Abstract: With the rise of edge computing and edge intelligence, the lightweight deployment of convolutional neural networks has gradually become a research hotspot. Traditional compression techniques for convolutional neural networks usually perform pruning and quantization in separate, independent stages. Because this approach ignores the interaction between the pruning and quantization processes, it cannot reach the optimal pruning and quantization results, which degrades the performance of the compressed model. To solve these problems, this paper proposes CoTrim, a joint compression method for neural networks based on deep reinforcement learning. CoTrim performs channel pruning and weight quantization simultaneously and uses a deep reinforcement learning algorithm to search for the globally optimal pruning and quantization strategy, balancing the impact of pruning and quantization on network performance. Experiments with VGG and ResNet on the CIFAR-10 dataset show that, for common single-branch convolution and residual convolution structures, CoTrim can compress VGG16 to 1.41% of its original model size with an accuracy loss of only 2.49 percentage points. Experiments with the compact network MobileNet and the densely connected network DenseNet on the more complex ImageNet-1K dataset show that, for depthwise separable convolution structures and densely connected structures, CoTrim still keeps the accuracy loss within an acceptable range while compressing the model to between 1/5 and 1/8 of its original size.
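The abstract only outlines the idea, so the sketch below is a minimal, hypothetical illustration of what joint per-layer channel pruning and weight quantization driven by a searched policy can look like. It is not the authors' CoTrim implementation: the function names, the L1-norm channel-pruning criterion, the uniform symmetric quantizer, the reward definition, and the random-search loop standing in for the deep reinforcement learning agent are all assumptions made purely for illustration.

```python
# Illustrative sketch only: joint per-layer channel pruning + weight quantization,
# with a random-search placeholder where a DRL agent (e.g., DDPG) would propose
# actions and learn from the reward. Not the CoTrim implementation.
import torch
import torch.nn as nn


def prune_channels(conv: nn.Conv2d, keep_ratio: float) -> None:
    """Zero out the output channels with the smallest L1 norms (structured pruning)."""
    n_keep = max(1, int(round(conv.out_channels * keep_ratio)))
    l1 = conv.weight.detach().abs().sum(dim=(1, 2, 3))       # per-channel importance
    drop = torch.argsort(l1)[: conv.out_channels - n_keep]   # least important channels
    with torch.no_grad():
        conv.weight[drop] = 0.0
        if conv.bias is not None:
            conv.bias[drop] = 0.0


def quantize_weights(conv: nn.Conv2d, n_bits: int) -> None:
    """Uniform symmetric weight quantization to n_bits (simulated in float)."""
    with torch.no_grad():
        w = conv.weight
        scale = w.abs().max().clamp(min=1e-8) / (2 ** (n_bits - 1) - 1)
        q = torch.round(w / scale).clamp(-(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1)
        conv.weight.copy_(q * scale)


def apply_policy(model: nn.Module, actions) -> None:
    """Apply one (keep_ratio, bit_width) action per conv layer -- jointly, not in stages."""
    convs = [m for m in model.modules() if isinstance(m, nn.Conv2d)]
    for conv, (keep_ratio, n_bits) in zip(convs, actions):
        prune_channels(conv, keep_ratio)
        quantize_weights(conv, n_bits)


def evaluate(model: nn.Module, loader) -> float:
    """Top-1 accuracy of the compressed model, used here as the reward signal."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            correct += (model(x).argmax(dim=1) == y).sum().item()
            total += y.numel()
    return correct / max(total, 1)


if __name__ == "__main__":
    import copy
    import random

    # Toy model and fake data keep the sketch self-contained and runnable.
    model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))
    loader = [(torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,)))]

    best = None
    for _ in range(5):
        # A DRL agent would propose these per-layer actions and be updated from
        # the reward; random search is used here only to keep the example short.
        actions = [(random.uniform(0.3, 1.0), random.choice([2, 4, 8]))]
        trial = copy.deepcopy(model)
        apply_policy(trial, actions)
        reward = evaluate(trial, loader)
        if best is None or reward > best[0]:
            best = (reward, actions)
    print("best (reward, per-layer actions):", best)
```

The point of the sketch is the structure of the search: pruning ratio and bit width are chosen together for each layer and judged by a single reward, rather than being fixed in two independent stages.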

Key words: convolutional neural network, deep reinforcement learning, model compression, channel pruning, weight quantization, edge intelligence
