English  |  正體中文  |  简体中文  |  全文笔数/总笔数 : 62819/95882 (66%)
造访人次 : 4007162      在线人数 : 562
RC Version 7.0 © Powered By DSPACE, MIT. Enhanced by NTU Library & TKU Library IR team.
搜寻范围 查询小技巧:
  • 您可在西文检索词汇前后加上"双引号",以获取较精准的检索结果
  • 若欲以作者姓名搜寻,建议至进阶搜寻限定作者字段,可获得较完整数据
  • 进阶搜寻


    jsp.display-item.identifier=請使用永久網址來引用或連結此文件: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/87751


    题名: 對稱正定矩陣Cholesky分解之GPU加速研究
    其它题名: GPU-based Cholesky decomposition for symmetric positive definite matrices
    作者: 吳博雋;Wu, Bo-Jun
    贡献者: 淡江大學資訊管理學系碩士班
    魏世杰;Wei, Shih-Chieh
    关键词: Cholesky分解;圖形處理單元(GPU);Intel數學核心函式庫(MKL);Cholesky Decomposition;Graphics Processing Units;Intel Math Kernel Library
    日期: 2012
    上传时间: 2013-04-13 11:40:16 (UTC+8)
    摘要: 本研究旨在利用近期發展之圖形處理單元(GPU)硬體,提昇特定矩陣運算之效能。在求解線性最小平方差問題中,常需要計算共變異矩陣之反矩陣。這時利用共變異矩陣滿足對稱正定矩陣之特性,求解過程可使用Cholesky分解,其相較於適用一般矩陣之LU分解,速度還快上一倍。近年來隨著技術進步,一片圖形卡中常可置入數百核作運算,相較於CPU之少數8核或16核,若能將圖形卡視為一般用途圖形處理單元(GPGPU),將可加速執行一般可平行化運算。目前文獻中已有許多矩陣運算平行化之研究,本文將針對求對稱正定矩陣逆運算常用之Cholesky分解,找尋現有網路上公開之GPU實作原始碼,分析其結構,評比其表現,並提出改良方法。經過測試,相較於純CPU執行之Intel數學核心函式庫(Math Kernel Library,MKL)版本,本文改良法可透過GPU提昇維度10000方陣之Cholesky分解速度達3.5倍快程度。
    This work aims to apply the recently developed Graphics Processing Unit (GPU) to performance enhancement of a specific matrix operation. When solving the linear least squares problem, it is often necessary to compute the inverse of a covariance matrix. As the covariance matrix satisfies the condition of a symmetric positive definite matrix, the Cholesky decomposition can be used which is twice as fast as the LU decomposition on general matrices. In recent years, with the advances in technology, a graphics card can accommodate hundreds of cores compared to the small number of 8 or 16 cores on CPU. Therefore a trend is seen to use the graphics card as a general purpose graphics processing unit (GPGPU) for parallel computation. There are already many works on parallel matrix operations in the literature. This work will focus on the Cholesky decomposition commonly used in computing the inverse of a symmetric positive definite matrix. First, several open source GPU-based Cholesky decomposition programs on the Internet were located, analyzed, and evaluated. Then several strategies for performance improvement were proposed and tested. After experiments, compared to the CPU version using the Intel Math Kernel Library (MKL), our proposed GPU improvement strategy can achieve a speedup of 3.5x on the Cholesky decomposition of a square matrix of dimension 10,000.
    显示于类别:[資訊管理學系暨研究所] 學位論文

    文件中的档案:

    档案 大小格式浏览次数
    index.html0KbHTML221检视/开启

    在機構典藏中所有的数据项都受到原著作权保护.

    TAIR相关文章

    DSpace Software Copyright © 2002-2004  MIT &  Hewlett-Packard  /   Enhanced by   NTU Library & TKU Library IR teams. Copyright ©   - 回馈