Flexible and Efficient Hardware Design for Neural Network Pooling Layer
HE Zeng, ZHU Guoquan, YUE Keqiang
1. School of Electronic Information, Hangzhou Dianzi University, Hangzhou 310018, China
2. Intelligent Computing Hardware Research Center, Zhijiang Laboratory, Hangzhou 311100, China