Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (32): 27-30.

• Research and Discussion •

Parallel algorithm for solving large-scale dense linear system on CUDA

YANG Mei¹, LI Zhimin¹, CAO Dayong²

  1. Department of Electrical Engineering, Harbin Institute of Technology, Harbin 150001, China
  2. Department of Applied Mathematics, Harbin University of Science and Technology, Harbin 150080, China
  • Online: 2011-11-11 Published: 2011-11-11

CUDA架构下大规模稠密线性方程组的并行求解

杨 梅¹, 李志民¹, 曹大勇²

  1. 哈尔滨工业大学 电气工程系,哈尔滨 150001
  2. 哈尔滨理工大学 应用数学系,哈尔滨 150080

Abstract: This paper proposes a parallel, improved version of the Gauss-Jordan elimination algorithm for solving large-scale dense linear systems on CUDA. After analyzing the procedure of Gauss-Jordan elimination and the constraints of CUDA, it introduces a new logical organization, “grid-strip-group-block-thread”, together with the concepts of “based line” and “global based line”, on which the parallel CUDA version of the algorithm is built. Numerical experiments on test instances of size up to 4000 show that the algorithm exploits the GPU's advantages and effectively reduces the computation time for large-scale dense linear systems.

Key words: Compute Unified Device Architecture (CUDA), parallel algorithm, improved Gauss-Jordan elimination algorithm, large-scale dense linear system

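Abstract (Chinese original, translated): Building on Gauss-Jordan elimination, an improved parallel Gauss-Jordan elimination algorithm suited to the CUDA architecture is presented. By analyzing the steps of the method and the corresponding restrictions of the CUDA architecture, a five-level grid-strip-group-block-thread structure is proposed from the standpoint of algorithm construction, on top of CUDA's three-level grid-block-thread organization. The concepts of the based line (基础行) and the global based line (全局基础行) are defined, and a parallel version of Gauss-Jordan elimination suited to the CUDA architecture is constructed. The algorithm is compared with serial Gauss-Jordan elimination on large-scale dense linear systems of dimension up to 4000. The experimental results show that it makes full use of the GPU's hardware characteristics and effectively reduces the solution time for large-scale dense linear systems.

Key words (Chinese original, translated): Compute Unified Device Architecture (CUDA), parallel algorithm, improved Gauss-Jordan elimination, large-scale dense linear system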
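
For orientation, the serial Gauss-Jordan elimination that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration and not the authors' algorithm: the function name `gauss_jordan_solve`, the use of partial pivoting, and the singularity tolerance are assumptions made here, and the paper's "improved" variant and its five-level thread organization are not reproduced. Python is used only for brevity. The point of the sketch is that, for a fixed pivot row, the updates of all other rows are independent of one another, which is the kind of work a GPU implementation can distribute across threads.

```python
def gauss_jordan_solve(a, b):
    """Solve the dense square system a x = b by serial Gauss-Jordan elimination.

    Illustrative baseline only; partial pivoting is an assumption of this
    sketch, not something stated in the abstract.
    """
    n = len(a)
    # Build the augmented matrix [A | b] as floats, leaving the inputs untouched.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]

    for k in range(n):
        # Partial pivoting: choose the row with the largest |entry| in column k.
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        if abs(m[p][k]) < 1e-12:  # hypothetical tolerance
            raise ValueError("matrix is singular to working precision")
        m[k], m[p] = m[p], m[k]

        # Normalize the pivot row so the pivot entry becomes 1.
        pivot = m[k][k]
        m[k] = [v / pivot for v in m[k]]

        # Eliminate column k from every other row (above and below the pivot).
        # Each row update depends only on row i and the pivot row, so the
        # iterations of this loop are independent of one another.
        for i in range(n):
            if i != k:
                factor = m[i][k]
                if factor != 0.0:
                    m[i] = [vi - factor * vk for vi, vk in zip(m[i], m[k])]

    # The left block is now the identity, so the last column holds the solution.
    return [m[i][n] for i in range(n)]


# Small usage example; the exact solution of this system is x = (2, 3, -1).
x = gauss_jordan_solve([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]], [8, -11, -3])
print(x)
```

The serial cost is on the order of n³ arithmetic operations for an n × n system, which is why systems of dimension in the thousands, such as the instances of size up to 4000 mentioned in the abstract, are natural candidates for GPU acceleration.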