Computer Engineering and Applications, 2023, Vol. 59, Issue (1): 15-25. DOI: 10.3778/j.issn.1002-8331.2204-0187
• Research Hotspots and Reviews •
WAN Duo, HU Moufa, XIAO Shanzhu, ZHANG Yan
Online: 2023-01-01
Published: 2023-01-01
WAN Duo, HU Moufa, XIAO Shanzhu, ZHANG Yan. Survey on Heterogeneous Parallel Computing Platform for Edge Intelligent Computing[J]. Computer Engineering and Applications, 2023, 59(1): 15-25.
URL: http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.2204-0187