Performance evaluation and optimization of cache on fused CPU-GPU architecture

Computer Engineering and Applications (计算机工程与应用), 2017, Vol. 53, Issue (2): 47-52. DOI: 10.3778/j.issn.1002-8331.1503-0333

SUN Chuanwei, AN Hong, SUN Sun, CHEN Junshi

School of Computer Science and Technology, University of Science and Technology of China, Hefei 230027, China

Online: 2017-01-15  Published: 2017-05-11

Abstract

Abstract: The development of CPUs and GPUs has run into new bottlenecks, and integrating both on a single chip has become an architectural trend. This heterogeneous architecture puts pressure on the management of on-chip shared resources, and the management of the shared last-level cache (LLC) is critical to performance. Because CPU and GPU applications have different characteristics, managing an LLC shared between CPUs and GPUs brings new challenges. This paper analyzes the memory access characteristics of GPU applications and, drawing on earlier cache management schemes, proposes two static partitioning schemes for the LLC of a fused CPU-GPU system: an equal (half-to-half) partition and an optimal static partition. Experimental results show that cache partitioning effectively avoids interference between CPU and GPU applications. Compared with the conventional LRU policy, the half-to-half and optimal static partitions improve overall system performance by 7.68% and 11.62%, respectively (the journal's English abstract gives 11.68% for the latter).

Key words: heterogeneous architecture, fusion, shared last-level cache, static cache partition
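The idea of static partitioning described in the abstract can be illustrated with a toy model. The sketch below is not the authors' simulator or method; it assumes way-based partitioning of a set-associative cache, disjoint CPU and GPU address ranges, a synthetic workload (a small reused CPU working set and a streaming GPU access pattern), and an exhaustive search over way splits as a stand-in for the "optimal" partition. All names (`PartitionedLLC`, `run`, `best_split`) and parameters are hypothetical.

```python
from collections import OrderedDict


class PartitionedLLC:
    """Toy set-associative cache with LRU replacement and optional static way quotas."""

    def __init__(self, num_sets, ways, quota=None):
        # quota maps a requester ("cpu"/"gpu") to its ways per set;
        # quota=None models an unpartitioned cache managed by plain LRU.
        if quota is not None and sum(quota.values()) != ways:
            raise ValueError("quotas must add up to the associativity")
        self.num_sets, self.ways, self.quota = num_sets, ways, quota
        # One OrderedDict per set: tag -> owner, ordered from LRU to MRU.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits, self.accesses = {}, {}

    def access(self, owner, block_addr):
        lines = self.sets[block_addr % self.num_sets]
        tag = block_addr // self.num_sets
        self.accesses[owner] = self.accesses.get(owner, 0) + 1
        if tag in lines:
            lines.move_to_end(tag)  # promote to MRU on a hit
            self.hits[owner] = self.hits.get(owner, 0) + 1
            return True
        if self.quota is None:
            if len(lines) >= self.ways:
                lines.popitem(last=False)  # evict the set-wide LRU line
        else:
            owned = [t for t, o in lines.items() if o == owner]
            if len(owned) >= self.quota[owner]:
                del lines[owned[0]]  # evict this requester's own LRU line
        lines[tag] = owner
        return False

    def hit_rate(self, owner):
        return self.hits.get(owner, 0) / self.accesses[owner]


def run(quota, rounds=10, num_sets=4, ways=8):
    """Interleave a reused CPU working set with a streaming GPU pattern (synthetic)."""
    llc = PartitionedLLC(num_sets, ways, quota)
    next_gpu_block = 1000  # assumed GPU address range, disjoint from the CPU blocks
    for _ in range(rounds):
        for block in range(16):  # CPU: 16 blocks reused every round
            llc.access("cpu", block)
        for _ in range(64):  # GPU: 64 fresh blocks per round, never reused
            llc.access("gpu", next_gpu_block)
            next_gpu_block += 1
    return llc


def best_split(ways=8):
    """Exhaustively try every CPU/GPU way split and keep the one with the most hits."""
    def total_hits(cpu_ways):
        llc = run({"cpu": cpu_ways, "gpu": ways - cpu_ways}, ways=ways)
        return sum(llc.hits.values())
    return max(range(1, ways), key=total_hits)


if __name__ == "__main__":
    print("shared LRU   CPU hit rate:", run(None).hit_rate("cpu"))
    print("half-to-half CPU hit rate:", run({"cpu": 4, "gpu": 4}).hit_rate("cpu"))
    print("best CPU ways by exhaustive search:", best_split())
```

In this synthetic setup the streaming GPU traffic pushes the CPU working set out of the unpartitioned cache, whereas a fixed quota keeps the CPU lines resident. The figures it prints say nothing about the paper's measured results.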

SUN Chuanwei, AN Hong, SUN Sun, CHEN Junshi. Performance evaluation and optimization of cache on fused CPU-GPU architecture[J]. Computer Engineering and Applications, 2017, 53(2): 47-52.
